Software of the Month Club 1996 August

Text File | 1995-08-16 | 146.7 KB | 3,542 lines

TexaSoft's
USING KWIKSTAT 4.1
Reference Guide, Condensed Disk Version

(C)Copyright 1995 Alan C. Elliott
(C)Copyright 1991, 92, 93, 94, 95 Alan C. Elliott

For additional information on this product, contact TexaSoft, P.O. Box 1169, Cedar Hill, Texas 75106-1169, (214) 291-2115, Fax: (214) 291-3400, Compuserve: 70721,3145, Internet: 70721.3145@compuserve.com.

Production team: Alan Elliott, Marcia Stoesz, Paul Witt, Nancy Witt, Carol Bigler, Doug Pollock, Melanie Walker.

Program Testing Team:

     Leo Bolta, Shopsy's Foods, Canada
     Randy Hamlin, Senior Scientist, R&D Dept., Hunt-Wesson, Inc.
     Dick Hawkins, Hunt-Wesson, Inc.
     Jack Holloway III, University of Phoenix
     Robert Jirsa, Southern CT State University
     Chip Kloos, Lab Manager, R&D Dept., Hunt-Wesson, Inc.
     Victor L. Landry, Ph.D., The Mercy Hospital of Pittsburgh, PA
     Gerard Leboucher, Labo Ethologie Et Psychoph, France
     Allen Lein, Professor Emeritus, UCSD
     Randolph Maheux, Tampa, FL
     Sukanya Misra, SMU
     Sanford Moos, Lab Manager, Enzo Labs
     Joseph Padgett, Raleigh, NC
     George Sadler, Ph.D., USC Prof. Emeritus, Okla. City, OK
     Dr. Karl-August Schaeffer, Cologne, Germany
     Michael Stratil, Ph.D., Pembroke State University
     Wayne Woodward, Ph.D., SMU
     Carter Yeager, Boston University

All rights reserved. No part of this book may be reproduced without prior permission. For information, address TexaSoft, P.O. Box 1169, Cedar Hill, Texas 75106-1169.

No patent liability is assumed with respect to the use of the information contained herein. While every precaution has been taken in the preparation of this publication, the publisher assumes no responsibility for errors or omissions. Neither is any liability assumed for damages resulting from the use of the information herein.

The KWIKSTAT software and manual ("documentation") are copyrighted by TexaSoft and are protected by both United States copyright and International treaty provisions. Before you can use this program on an on-going basis, you must pay a license fee.
An order form is on disk in the file named KSORDER.TXT.

Important Information in the file LATENEWS.DOC: On the KWIKSTAT disk is a file named LATENEWS.DOC, which contains information about the program that is not in this documentation.

--------------------------------------------------------------------
Please Become a Registered User     1     KWIKSTAT 4.1 Statistical Data Analysis
--------------------------------------------------------------------

===========================
AN OVERVIEW OF KWIKSTAT 4.1
===========================

KWIKSTAT is a statistical data analysis program. It was designed by professional statistical consultants and researchers to allow you to quickly and easily use the most commonly needed statistical data analysis procedures and graphs.

KWIKSTAT REQUIREMENTS

KWIKSTAT is a DOS program and may be run either as a stand-alone application or as a DOS window executed from within the "Windows" operating environment. It requires PC-DOS or MS-DOS version 3.0 or higher. If you are running Windows, version 3.1 or higher is required. Your computer should have at least 512K of free RAM. KWIKSTAT graphics require an EGA or VGA compatible monitor. Many printers are supported. A mouse is optional.

INSTALLATION

In most cases, for a quick installation to a hard disk, you enter the command:

     INSTALL

Follow the instructions on the screen. This procedure will install the program to a disk, and will automatically run the KWIKSTAT setup program. After installing KWIKSTAT, you can begin the program from the DOS prompt by entering the command KS. See the document on disk named KSWINDOWS.DOC for information on installing KWIKSTAT as a Windows icon.

USING THE KWIKSTAT MENUS

KWIKSTAT menus are similar to a Windows menu system. The main menu bar contains five options: File, Edit, Analyze, Helps and About. Using the right and left arrow keys on the cursor pad, you can move the menu selection to one of the other menu bar options.
Pressing the right arrow key once moves the menu bar option from File to Edit. The File pull-down menu vanishes and the Edit pull-down menu appears. Pressing the left arrow key moves the selection back to the File menu. Or, point to a menu option with the mouse and click.

To select options from an extended (pulled-down) menu, use the up and down arrow keys to highlight the option you desire, then press the Enter key. Or, press the first letter of the option name. If you are using a mouse, point to the selection with the mouse pointer and click. Here is a brief description of each menu:

USING THE FILE MENU

Before you can create a graph or calculate statistics, you must create a database and enter data. The options are:

     NEW DATABASE - You must create a new database and enter data before doing any analysis or creating a graph. The program creates and reads dBase (.DBF) type file format databases.

     OPEN A DATABASE - Open an existing database. A database must be opened so the program will know where the data is located.

     SUBSET DATABASE - Creates a new database that is a subset of the current database.

     COPY/BACKUP - Creates a copy of the database. It is useful to create a duplicate copy in case the original copy is damaged.

     LIST (DISPLAY) THE CONTENTS OF THE DATABASE - Displays the information in the database to the screen.

     MODIFY OR DISPLAY DATABASE STRUCTURE - Allows you to view or change characteristics of the database, including field widths and types.

     KILL - Deletes a database file from your disk.

     FILE UTILITIES - Imports information, creates reports, or outputs data.

     EXIT - Ends the program.
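
Because the FILE menu works with dBase (.DBF) files, it may help to see roughly what such a file holds. The sketch below builds a tiny .DBF image in memory and reads its structure back. It assumes the widely documented dBASE III header layout; this guide does not describe KWIKSTAT's exact on-disk variant, and the function names, the two-field layout and the sample values are hypothetical.

```python
import struct

def build_dbf(fields, rows):
    """Build a minimal dBASE III-style image (assumed layout, not KWIKSTAT's own code).

    fields: list of (name, type_char, width, decimals); rows: list of tuples.
    """
    record_len = 1 + sum(width for _, _, width, _ in fields)  # 1 byte delete flag
    header_len = 32 + 32 * len(fields) + 1                    # header, descriptors, 0x0D
    # Version byte, last-update date (YY MM DD), record count, header and record sizes.
    out = struct.pack("<BBBBIHH20x", 0x03, 95, 8, 16, len(rows), header_len, record_len)
    for name, ftype, width, dec in fields:
        # 11-byte null-padded name, 1-byte type, width and decimal count.
        out += struct.pack("<11sc4xBB14x", name.encode("ascii"),
                           ftype.encode("ascii"), width, dec)
    out += b"\x0d"                                            # end of field descriptors
    for row in rows:
        out += b" "                                           # space = record not deleted
        for (name, ftype, width, dec), value in zip(fields, row):
            if ftype == "N":
                text = f"{value:>{width}.{dec}f}"             # numbers are right-justified
            else:
                text = str(value).ljust(width)[:width]        # characters are left-justified
            out += text.encode("ascii")
    return out + b"\x1a"                                      # end-of-file marker

def read_structure(data):
    """Return (record_count, [(name, type, width, decimals), ...]) from a .DBF image."""
    _, _, _, _, nrec, _, _ = struct.unpack_from("<BBBBIHH", data, 0)
    fields, pos = [], 32
    while data[pos] != 0x0D:
        name, ftype, width, dec = struct.unpack_from("<11sc4xBB", data, pos)
        fields.append((name.split(b"\x00")[0].decode("ascii"),
                       ftype.decode("ascii"), width, dec))
        pos += 32
    return nrec, fields

# Hypothetical two-field database with two records.
layout = [("GRADE", "C", 2, 0), ("SCORE", "N", 5, 1)]
image = build_dbf(layout, [("A", 22.3), ("B", 22.8)])
count, structure = read_structure(image)
print(count, structure)
```

The structure read back here (name, type, width, decimals) is the same information the Modify or Display Database Structure option shows for an open database.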
USING THE EDIT MENU

The EDIT menu contains options that allow you to enter new data into a database, edit data currently in a database, and other editing options:

     EDIT RECORDS - Change data already in the database.

     APPEND RECORDS - Add new records to the database.

     MISSING VALUE CODES - Define missing value codes for your database. Refer to the section titled "Setting Missing Value Codes".

     PACK DATABASE - Permanently erase all records marked for delete.

     ZAP - Get rid of all records in a database.

USING THE ANALYZE MENU

The KWIKSTAT ANALYZE menu allows you to choose which analysis module to run. The menu items on the ANALYZE menu are:

     Descriptive Statistics
     Graphs - Descriptive & Comparative
     XYZ Visualization/Spin Plot
     t-tests and Analysis of Variance (ANOVA)
     Nonparametric Comparisons
     Regression & Correlation
     Crosstabulations, Frequencies, Chi Square
     Life Tables and Survival Analysis
     Data Generation, Concepts and Simulations
     2-Way Advanced ANOVA Designs (Professional edition)
     Multiple Comparisons (Professional edition)
     Advanced Regression (Professional edition)
     Time Series Analysis (Professional edition)
     Quality control graphs and charts (Professional edition)
     Pareto Charts (Professional edition)

USING THE HELP MENU

The KWIKSTAT Help system contains items to help you operate the program. These include:

     PROGRAM HELP - Contains general program help information
     TUTOR - Learn how to use the program
     DECIDE WHAT ANALYSIS TO USE
     CHANGE SETUP OPTIONS - Color, Printer, etc.
     AUTOHELP/Hints (On or Off)
     GO TO DOS, Return with Exit (Shell)

TUTORIAL: TRY THIS EXAMPLE

This short tutorial will give you a feeling for how to use KWIKSTAT. It will assume you are using KWIKSTAT on a hard disk. To begin KWIKSTAT, you must first be in the \KS4 directory on your hard disk.
Use the CD (Change Directory) command from the DOS prompt to change to the \KS4 directory by using the command:

     CD\KS4

(or the directory where you installed KWIKSTAT.) Once in the \KS4 directory, begin KWIKSTAT with the KS command:

     KS

The FILE pull-down menu will appear. (If the ANALYZE pull-down menu appears, press the left arrow key twice to open the FILE menu.) This example will use data already stored in a dBase ".DBF" file named EXAMPLE. Follow these steps:

Step 1  Select a data file: Open the database named EXAMPLE by selecting the Open a database option from the FILE menu, then choose EXAMPLE from the list of files. (If the EXAMPLE database does not appear on the list of databases, you may not have installed the program correctly.) Once the database is opened, a notice at the bottom of the screen tells you that the database named EXAMPLE is open, and it contains 50 records.

Step 2  View the contents of the database: Choose the List (display) option from the menu.

Step 3  Choose the ANALYZE pull-down menu: From the ANALYZE pull-down menu, choose the Graphs option. KWIKSTAT now switches to the Graphs module (which may take a few seconds). Soon, you will see the Graphs module menu.

Step 4  Select a plot type: From the Graphs menu, choose the TIME SERIES PLOT option. The program now displays a screen with the fields available for use by the graph. The field names appear in a pick box to the left of the screen, and an empty box appears to the right of the screen.

Step 5  Select fields to plot: From the left screen (list of fields), choose which fields to plot in the time series plot. Choose the fields TIME1 and TIME2. When you choose a field, it will appear in the "Fields to Graph" box. After you choose the two fields, choose "Finished Choosing Fields."
After a few seconds, a time series plot will appear containing two lines. The menu above the graph may be used to select user definable features of the graphic display. The options include:

     EXIT - exit the graph and return to the module menu.
     OPTIONS - choose options for the graph including title, axis names, footnote, and other options.
     PRINT - print the graph to your printer.
     CAP/PCX - capture the graph as a PCX file.
     GET COLORS - choose colors to be used to display the graph.
     +/- - SmartPoint(tm) pointer used to select a datapoint on the screen. Information about the selected point will be displayed. (See step 7.)

Step 6  Control the plot menu: On all graphic screens in KWIKSTAT, a similar menu will appear at the top of the screen. If you want the menu to temporarily vanish, press the spacebar. Press the spacebar again for the menu to reappear. The menu is active even if it is not shown.

Step 7  Using SmartPoint(tm): If you are using a mouse, place the mouse pointer on a datapoint on the screen and click. Or, if you are not using a mouse, press the + (plus) key, then use the cursor keys to position the + over a datapoint on the screen, and press Enter. Information about the datapoint you selected will be displayed on the screen. This feature is particularly helpful in identifying points that are interesting or "outliers." Press Enter to return to the plot.

Step 8  Select color options: Press G or click on the "get color" option. The color menu will appear. From this menu you can select color options.
Most plots contain the following "get colors" menu:

     Menu - returns to the main graph menu
     graph - changes graph colors
     screen - changes background screen colors
     labels - changes label colors
     default - returns plot to original default colors
     b&w - displays plot in black and white
     help - displays help system

Some "get color" menus also contain the option:

     tile - paints the plot using tiles rather than solid colors

Step 9  End this example: Choose Menu to return to the Graph menu. Choose Exit to return to the main graph menu. Choose Exit again to return to the module menu.

TIP: When you choose to print or capture the screen, the menu will vanish, so the menu will not appear on the printout or PCX graphic.

NOTE: You can also go through an on-screen tutorial on how to create a chart by choosing the TUTORIAL option on the HELPS pull-down menu.

EXAMPLE 2: DESCRIPTIVE STATISTICS

This example uses the data stored in the database named EXAMPLE.

Step 1  Open the database: To open the EXAMPLE database, select the Open a Database option, then choose the EXAMPLE database.

Step 2  Select an analysis module: Open the ANALYZE pull-down menu and choose the Descriptive Statistics option from that menu. KWIKSTAT switches to the Descriptive Statistics module.

Step 3  Choose analysis type: From the Descriptive Statistics menu, choose "Detailed statistics (single interval variable)." The program now displays the variables available for analysis from the database.

Step 4  Select a field: Select the AGE variable. KWIKSTAT will perform calculations on the data in field AGE, and will produce a screen of descriptive statistics. At the bottom of this screen are several options.
The Descriptive Statistics report screen menu options are:

     Exit - returns to the main module menu.
     graph - displays a graphic representation of the data.
     view/print - allows you to print the information using the KWIKSTAT text viewer.
     ci - allows you to choose the confidence interval level.
     percent - allows you to choose what percentiles will be displayed.
     new var - returns you to the variable selection menu.

Step 5  Display the graph: Choose the Graph option. This screen allows you to examine the distribution of numbers in the variable being analyzed (AGE in this case). Several options on this menu that were not described in the previous example are:

     exit - returns to the descriptive statistics summary screen.
     mean CI off/on - displays a normal curve on the histogram, and shows where the mean and confidence interval are located on the box plot.
     print - prints the screen to the printer.
     + (plus) and - (minus) - redraws the histogram using more or fewer classes.
     dist. off/on - allows you to display a cumulative distribution for the data.
     cap/PCX - captures the graph in a file using the PCX graph format.
     b&w or color - displays the graph using black and white or color.

Step 6  End this tutorial: Choose Exit to return to the previous screen. Choose Exit again to return to the module menu. End KWIKSTAT by choosing the Quit option. This ends the tutorial.

===========================
USING THE KWIKSTAT DATABASE
===========================

The FILE and EDIT pull-down menus are used to manage your data. Here is a roadmap to help you find the area of this chapter that will be most helpful to you according to your database needs:

YOUR DATA IS ALREADY IN A DBASE DATABASE: If your data is already in a database, you may not have to use any of the procedures in this chapter.
Simply copy your database into your KWIKSTAT data subdirectory, and it will be accessible immediately. If you have missing values, you should review the section titled "Missing Value Codes." If you have memo fields in your database, review the section called "Using dBASE files."

YOU NEED TO ENTER YOUR DATA: If your data is not already in a file, you will need to create a database and enter your data. In this case, read the next few sections beginning with "An Overview of Database Creation and Design" and perform the two examples to learn how to create a KWIKSTAT database.

YOUR DATA IS IN A NON-DBASE FILE: If your data is already on the computer, but not in a dBASE (.DBF) file, you can usually import the data into KWIKSTAT. See the section titled "Using data from other programs" later in this chapter.

DATABASE CREATION AND DESIGN

The process for entering data and performing an analysis can be summarized in three steps:

     Step 1: Create a database.
     Step 2: Enter data into the database.
     Step 3: Choose an analysis option.

OPTIONS FOR CREATING A NEW DATABASE

Before you can enter data into a database, you must create a new database. The New Database option on the FILE Menu is used to create a new database. The structure, or layout, of a database must be described before you enter your data. KWIKSTAT allows you to create a new database in two ways:

     1. Choose from a predefined structure, or
     2. Create a customized database structure.

The following sections describe these two options.

CREATING A DATABASE FROM A PRE-DEFINED STRUCTURE

A pre-defined structure is a "blank" database designed for a particular analysis type.
Using a pre-defined database allows you to create a database for your analysis without having to worry about what fields are necessary, what type they should be, their width, and so on. The list below contains examples of some of the pre-defined database descriptions available when you choose to create a New Database. Choose the option that will create a database structure for the kind of analysis you will perform. (You may have to scroll to see some of the options.)

     CREATE A CUSTOMIZED DATABASE
     SINGLE VARIABLE, DETAILED STAT, HISTOGRAM, STEM & LEAF
     GROUPED HISTOGRAM, STATISTICS OR STEM & LEAF
     SIMPLE BAR CHART: LABEL AND VALUE
     PIE CHART: LABEL AND VALUE
     ETC...

CREATING A CUSTOMIZED DATABASE

If none of the pre-defined database structures meets your needs, you need to create a customized database. The following sections describe how you create a customized database structure to match your analysis needs.

SPECIFYING FIELD TYPES REQUIRED BY AN ANALYSIS

When you create a new database, you must specify certain information about each data field, including the field name, type, width and number of decimals (if any):

1. The FIELDNAME: A fieldname must be 1 to 10 characters in length, MUST begin with a letter (a to z), and can contain letters, numbers and the underscore character "_". Upper and lower case DO NOT matter, since the name is always translated into all upper case.

2. The TYPE: Type may be . . .

     CHARACTER - May contain any characters.

     NUMERIC - Must contain numbers only. Examples of legal numbers are: 1.00, -4.32, 6, 10000. Examples of illegal numbers are: 450-23-1232, $23.95, 40%. (For data like these, use the character type.)

     DATE or LOGICAL fields can be created in KWIKSTAT, but KWIKSTAT analyses will only use numeric and character fields.
     Thus, date and logical fields are treated the same as CHARACTER fields, except in subsetting and in transformations (Replace).

3. The WIDTH of the field: Choose a width so that the maximum number of characters needed will fit into the field. For DATE or LOGICAL types, field widths are automatically set at 8 or 1 respectively.

4. DECIMALS: Decimals are only valid for numeric fields. This tells KWIKSTAT how many decimals to retain in the field. For example, if you wish to store numbers that are dollar prices, your data may look like "9999.99". This field would have a width of 7 (four digits, the decimal point, and two decimal digits), with 2 decimals.

KWIKSTAT DATABASE LIMITATIONS

     Maximum of 250 fields.
     Maximum width of a field name is 10 characters.
     Maximum width of a cell is 60 characters (15 for numbers).
     Dates are always 8 characters and logical fields are 1 character wide.
     Memo fields are not supported.

DATABASE EXAMPLES

This section provides you with two examples of creating a KWIKSTAT database. Go over these examples before creating your own database and performing your own analysis. Following these examples will answer a number of questions you may have about how to use KWIKSTAT.

EXAMPLE 1, USING A PRE-DEFINED STRUCTURE

This example shows how you would perform an independent group t-test in KWIKSTAT using one of the pre-defined database structures. In this example, 13 plants were randomly allocated to two groups. Group 1 received the present fertilizer and group 2 received a newer fertilizer. After a period of time, you observed the heights of the plants.
The results are:

Data for independent group t-test (fertilizer study)

     Present       Newer
     Fertilizer    Fertilizer
     ----------    ----------
     46.2 cm       51.3 cm
     55.6          52.4
     53.3          54.6
     44.8          52.2
     55.4          64.3
     56.0          55.0
     48.9

In order to enter this into a database, you must assign group numbers (or letters) to each group. For example, we will call the "Present Fertilizer" group 1 and the "Newer Fertilizer" group 2.

The database will include thirteen records (one for each plant) and two fields (one for the response and one for the group indicator). When entered into the database, the data will look like this:

     Group   Height
       1      46.2
       1      55.6
       1      53.3
       1      44.8
       1      55.4
       1      56.0
       1      48.9
       2      51.3
       2      52.4
       2      54.6
       2      52.2
       2      64.3
       2      55.0

Step 1  Create the Database: From the FILE Menu on the main KWIKSTAT menu screen select New database.

Step 2  Name the Database: You will be prompted to enter the name of the database you are creating. Type a name of at most 8 characters (e.g., TTEST) and press Enter.

Step 3  Select a Structure type: A screen with the instruction "Choose the database type to create from the menu below" will appear. Since you are performing an independent group t-test, you can select the option titled INDEPENDENT GROUP T-TEST OR ANOVA from this list. This process automatically builds a database structure suitable for entering data for this kind of analysis. In this case, the database will contain a grouping field GROUP (where you will enter a 1 or 2, the fertilizer type) and an observation field OBS (where you will enter the height).

Step 4  Enter the Data: A data entry screen will appear where you will enter the data. In the spreadsheet type of entry, the field names are listed at the top of the screen, and the record numbers at the left side. The data you will enter in the first record is 1 (press Enter) and 46.2 (press Enter).
When you type the 46.2 and press Enter, your cursor will automatically move to record number 2, where you will enter 1 and 55.6, and so on. Enter the data for the thirteen records. For each record of a "Present Fertilizer" observation, enter "1" for the GROUP variable. For the "Newer Fertilizer" observations enter a "2" for the GROUP variable. The eighth record is 2 and 51.3. Once all 13 records have been entered, the program will be waiting for a 14th record to be entered. Since there is no 14th record, press the F7 function key (Exit) to end the data entry process. KWIKSTAT will return to the Data main menu.

Step 5  Performing the Analysis: From the main KWIKSTAT menu, select ANALYZE. From the ANALYZE pull-down menu, select t-tests and Analysis of Variance (ANOVA). The menu will appear. Select Compare independent groups (t-test, ANOVA). A field selection dialog box titled "Choose a grouping variable" will appear with the following options:

     Exit Choices
     GROUP (N)
     OBS (N)

Select GROUP as the grouping field. Another dialog box will appear titled "Choose a data (numeric only) variable" with the same options. Select OBS as the data variable (OBS contains the height data). Note: If you want to cancel the analysis, you would choose the Exit Choices option.

KWIKSTAT will now perform the calculations and display another dialog box on the screen with the following options:

     A) View or print the calculated results
     B) Graphical Comparison
     Q) Quit, return to main menu

Choose the "A)View..." option. The results of the analysis will appear on the screen in the KWIKSTAT viewer. Select Exit (F7) to exit the viewer and return to the option menu. Choose "Q)Quit..." to return to the module menu. Select "X)Exit..." to return to the main KWIKSTAT menu.
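
If you would like to check the arithmetic behind Example 1 by hand, the comparison can be sketched outside the program. The code below computes an equal-variance (pooled) two-sample t statistic for the fertilizer data. Treat the choice of the pooled formula as an assumption: this guide does not say which variant KWIKSTAT reports, and the function name pooled_t is made up for illustration.

```python
from math import sqrt
from statistics import mean

# Plant heights (cm) from the fertilizer study in Example 1.
present = [46.2, 55.6, 53.3, 44.8, 55.4, 56.0, 48.9]   # group 1
newer = [51.3, 52.4, 54.6, 52.2, 64.3, 55.0]           # group 2

def pooled_t(a, b):
    """Return (t, df) for an independent group t-test assuming equal variances."""
    na, nb = len(a), len(b)
    ma, mb = mean(a), mean(b)
    # Sums of squared deviations about each group mean.
    ss_a = sum((x - ma) ** 2 for x in a)
    ss_b = sum((x - mb) ** 2 for x in b)
    df = na + nb - 2
    pooled_var = (ss_a + ss_b) / df
    std_err = sqrt(pooled_var * (1 / na + 1 / nb))
    return (ma - mb) / std_err, df

t, df = pooled_t(present, newer)
print(f"mean 1 = {mean(present):.2f}, mean 2 = {mean(newer):.2f}")
print(f"t = {t:.3f} with {df} degrees of freedom")
```

A negative t here only reflects the order of subtraction (group 1 minus group 2). The p-value still has to be looked up in a t table or taken from the program's output.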
EXAMPLE 2, CREATING A CUSTOMIZED DATABASE STRUCTURE

This example shows you how to enter data and perform some simple statistics and graphs. It will show you both the spreadsheet and database entry screens. The data that will be used is listed below. The GRADE variable is the grade received in the class, AGE is age, SEX is sex, WT is weight and SCORE is the score on a pre-test (maximum of 25 points). In database language, these variables are called fields.

Step 1  Choose the New database option: From the FILE menu, choose the New Database option. You will be prompted to enter the name of the database. Enter the name MYDATA (a DOS compatible filename).

Step 2  Choose Customize option: Once you have entered a filename for the database, a list of database structures will be displayed, similar to the list shown under "Creating a Database from a Pre-defined Structure." For this example, choose the CREATE A CUSTOMIZED DATABASE option.

          GRADE   AGE   SEX    WT   SCORE
      1     A      18    M    165    22.3
      2     B      19    M    145    22.8
      3     B      17    F    122    22.8
      4     C      18    M    196    18.5
      5     B      17    M    188    19.5
      6     B      18    F    140    23.5
      7     C      19    F    121    22.6
      8     B      20    F    112    21.0
      9     C      19    F    122    20.9
     10     A      18    M    176    22.5
     11     B      18    M    165    23.3
     12     A      19    M    135    21.8
     13     A      18    F    121    24.8
     14     C      19    M    186    16.5
     15     B      17    M    148    18.5

Step 3  Define the Database Structure: For each field (each item of data) in the database, you must specify the following information:

     A name of the field - something to identify it
     The type of data - is it numeric or character?
     The width of the field - enough to hold the biggest entry
     Decimal places - if needed

For the data in this example, you will use the following information:

     Field name   Type   Width   Dec
     GRADE         C       2
     AGE           N       3
     SEX           C       2
     WT            N       4
     SCORE         N       5      1

The GRADE and SEX variables are of type "C" (Character) and the rest of the variables are numbers "N".
Only the SCORE variable requires a decimal value. Enter the information about the database structure into the database definition screen.

Step 4  Enter the data: A data entry screen will appear listing the names of all of the fields and an area to enter the data. KWIKSTAT includes two types of data entry screens, database type and spreadsheet type. In the Setup routine, you chose one of these two entry options. The following discussion shows you how to enter data in the spreadsheet screen.

TIP: You can toggle between spreadsheet entry mode and database entry mode by pressing the F8 (Switch) key.

Step 4A  Using a spreadsheet entry screen: The spreadsheet screen, as shown in figure 2.7, looks similar to a spreadsheet. If you prefer to use the database entry mode, skip to the section titled "Using a database entry screen." The names of the database fields (Grade, Age, etc.) are listed at the top of the screen (columns) and the record numbers are listed down the left side of the screen (rows). Initially, since you do not have any records entered into the database, the only row displayed is the -ADD- row, which indicates that you are adding a new record.

To enter data into the database, begin typing the entry for the first field (GRADE). Type an A (upper case), then press Enter. Your cursor moves to the next field (AGE). Type 18 and press Enter. Type upper case M and press Enter. Continue until you have entered 22.3 in the SCORE field. When you press Enter after entering 22.3, a new row appears to allow you to enter the second record of information, and your cursor moves to the first field of this record. Continue entering information in the spreadsheet until all records are entered.
If you make a mistake on a record, you can use the right or left arrow keys to move your cursor and correct the mistake. If you discover that you have made an error in a previous record, skip to step 5 now. When you have finished entering the information in the database, press F7 (Exit) to end the entry procedure.

Step 5  Correcting errors in the database: Before exiting the data entry screen, you can correct errors in data entry mode by using the F2 key to toggle into Edit mode. The edit screen is similar to the screen used to enter data. Use the cursor keys to move to the field to edit, and change the value. Exit the edit screen/data editor by selecting the F7 (Exit) command. You will return to the KWIKSTAT main menu.

Step 6  Perform a Descriptive Statistics analysis: Once you have entered your data into the database, you are ready to perform one or more analyses. From the main KWIKSTAT menu, choose the ANALYZE pull-down menu. Choose the Descriptive Statistics option on the ANALYZE menu. After a few seconds, the Descriptive Statistics menu will appear. This menu lists the options you can choose from the Descriptive Statistics module.

Step 7  Select Detailed Statistics: To calculate detailed statistics for a variable in your database, select the option called "Detailed Statistics (single interval variable)". A screen will appear prompting you to specify what field to use in the calculations. Select AGE. The detailed statistics for the selected variable will appear on the screen. Once you have examined these results, select Exit to exit the screen. This output is discussed later in "Using Descriptive Statistics."

USING THE KWIKSTAT FILE MENU

The FILE Menu is the first menu you normally use when you begin KWIKSTAT.
From this menu, you open a database or create a new database. Either way, you usually must have data in a database before you can perform an analysis. This section describes the options on the FILE Menu.

NEW DATABASE

When you choose New Database from the FILE Menu, you will be prompted to enter the name of the database. Enter a file name such as MYDATA. (Eight characters maximum.) Once you have entered a filename for the database, you can choose from a list of pre-defined database structures, or create your own. See Examples 1 and 2 earlier in this chapter for a tutorial on using a pre-defined structure or creating a customized database structure.

OPEN A DATABASE

The Open a database to use option on the FILE Menu allows you to access information in a dBASE file that you created in KWIKSTAT, in dBASE, or in any other program that creates .DBF files.

If the database you want to use is not in the current (default) directory, you can temporarily change the default directory by selecting "Choose New Path (F2)". You will be asked to enter a path name (such as \DB3). Then, the .DBF files in that directory will be displayed in a list, and you can choose the database to use from that list.

Another way to open a database that is not in the default directory is to enter it by name. To do this, select "Enter Choice by Name (F3)". You will be prompted to enter the name of the database to use. For example, if the database you want to use is MYDATA.DBF and it is in the \DB3 directory, you would enter \DB3\MYDATA. Do not include the .DBF extension in the name.

Once a database is open, you will see its name at the bottom left of the screen, along with the number of records in the database. You can then edit, pack, modify, set missing values, subset and list the database using the other options on the FILE and EDIT Menus.

SUBSET A DATABASE

The Subset database option on the FILE Menu allows you to create a new database from an old one.
The new database can be a subset of the old one, using a conditional criterion for outputting information from the old database to the new one.

For example, suppose you have a database with a field GROUP with values 1, 2, 3, 4 and 5. You want to create a database that does NOT include Group 5. After choosing Subset database from the FILE Menu, you are asked for the name of the new database. For example, your new database might be named NO5.DBF. You are asked for the field name to be used in the selection criteria. In this case, you would choose the field named GROUP. Next you must enter the selection relationship. It will be described as a numerical expression. The conditional operators you may use are:

     =    (equal)
     >    (greater than)
     <    (less than)
     >=   (greater than or equal to)
     <=   (less than or equal to)
     <>   (not equal to)

For example, you would enter the condition

     GROUP < 5

You can also use the logical operators .NOT., .AND., and .OR. It is important that a dot (.) appear before and after each logical operator. For example, a conditional expression to include only groups 1 and 5 would be

     GROUP = 5 .OR. GROUP = 1

Other examples of conditional expressions are

     GROUP > STATUS
     GROUP < WEIGHT*HEIGHT
     TIME1 = TIME2*1.96
     SEX <> 'F'
     TIME1 <= 20 .AND. SEX = 'M'

When you choose the Subset option from the FILE Menu, a Subset dialog box appears on the screen. There are two items you must enter in the Subset dialog box. First is a name for the new database. This must not be the same name as the current database. Then, you must enter the subset criteria, such as the example conditions listed above. Once you have entered the filename and condition, press the F7 key to begin the subset procedure.
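The selection conditions above can be pictured as a filter applied to each record. The following Python sketch is only an illustration of that logic, not KWIKSTAT's own code. The sample records and the subset helper are hypothetical, and Python's ==, and, or stand in for the =, .AND. and .OR. operators.

```python
# Hypothetical records standing in for a database with GROUP, SEX and TIME1 fields.
records = [
    {"GROUP": 1, "SEX": "M", "TIME1": 18},
    {"GROUP": 2, "SEX": "F", "TIME1": 25},
    {"GROUP": 5, "SEX": "M", "TIME1": 22},
    {"GROUP": 3, "SEX": "F", "TIME1": 19},
    {"GROUP": 5, "SEX": "F", "TIME1": 20},
]


def subset(rows, condition):
    """Return a new list with the rows for which condition(row) is true.

    The original list is left untouched, mirroring the way the Subset
    option writes to a new database instead of changing the old one.
    """
    return [row for row in rows if condition(row)]


# GROUP < 5
no5 = subset(records, lambda r: r["GROUP"] < 5)

# GROUP = 5 .OR. GROUP = 1
only_1_and_5 = subset(records, lambda r: r["GROUP"] == 5 or r["GROUP"] == 1)

# TIME1 <= 20 .AND. SEX = 'M'
fast_males = subset(records, lambda r: r["TIME1"] <= 20 and r["SEX"] == "M")

print(len(no5), len(only_1_and_5), len(fast_males))
```

Because subset returns a new list, the original records remain available for further selections.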
A database with the name you specified will be created, containing the records selected by your condition statement.

Note: When creating conditional expressions for subsetting, use the functions described in the table "Database Calculator Functions" later.

COPY/BACKUP DATABASE FILE

The Copy/Backup Database File option on the FILE menu allows you to quickly create a backup copy of your database information. This is recommended, particularly for large databases, because databases sometimes become corrupt or are made unusable for some other reason.

LIST RECORDS

The "List (display) the contents of a database" option on the FILE Menu allows you to look at the information in your database.

MODIFY OR DISPLAY STRUCTURE

The Modify or Display database structure option on the FILE Menu allows you to display the structure of your database, and allows you to change characteristics of the database structure. When you choose to display the structure, all field names, their types, widths and decimals (if any) are listed. When you choose to modify a database, you are given a chance to modify the characteristics of each field. Your options are:

     Delete the Field
     Change Name of Field
     Change Type of Field
     Change Width of Field
     Change Number of Decimal Places

If you change the type of field, say from character to numeric, the program will attempt to convert the contents of the field to its new type.

When you modify a database, you will be asked to enter the name of a new database. This means that the modified database will be in a new file, and your original database will remain intact. If you no longer want the old database, you must delete it by choosing the Kill database file option from the FILE Menu.
KILL (DELETE) A DATABASE

The Kill - Delete database file option allows you to delete a database and its related missing values files (if any).

FILE UTILITIES - EXPORT, IMPORT, REPORT

This menu provides several data utilities. When you choose this option, the KWIKSTAT utilities menu appears. It allows you to choose from these options:

Export data to an SDF file -- This option allows you to write out the data in a database to an ASCII file. This option is useful when you want to transfer data from your KWIKSTAT database to another program that does not read .DBF files. See "Exporting Data" later in this chapter.

Import data from a Comma Delimited file or from a 1-2-3 file -- See the section below titled "Entering or Importing Data" for information on how to import data into a KWIKSTAT database.

Report -- Allows you to produce a report using the data in your database. See the section titled "Printing a Report" later in this chapter.

Sort -- (Professional Edition Only) Allows you to sort your database in ascending or descending order using any field in the database.

EXIT KWIKSTAT

Use this option to end the KWIKSTAT program.

USING THE KWIKSTAT EDIT MENU

The EDIT menu allows you to modify and manage data in a KWIKSTAT database. Often, after you have created a database, you need to add new data or modify the current data. You can also add new fields, calculate new variables and delete fields or records. This section describes the options you can access from the EDIT menu.

EDIT RECORDS, ADD, DELETE OR REPLACE FIELDS

When you choose the Edit records option, the KWIKSTAT data editor will appear. The data editor is described in detail in the section titled "Using the KWIKSTAT Data Entry Screens" later in this chapter.

APPEND RECORDS, FROM KEYBOARD OR FILE...
The Append records option allows you to add new records to an existing database. You have three data entry options:

Enter data from keyboard - Allows you to enter data by typing it on the keyboard.

Enter data from a text file - Allows you to import data from an ASCII file. See the section titled "Entering and Importing Data" later in this chapter.

Append data from a dBASE file - Allows you to append data from another dBASE file. See "Appending Data from Another dBASE File" later in this chapter.

MISSING VALUES CODES

Sometimes in the collection of data there are values that are lost or cannot be gathered. These are called "missing values". When such values occur, it is important for the program to know that the values are missing so that statistical calculations may take this into account.

Missing values are usually designated by an impossible value. For example, the missing value code designated for the variable AGE may be -9, since it is impossible for the variable AGE to have the value -9. When the program is asked to calculate the mean of AGE, for example, it will ignore those records where AGE is -9 in that calculation if -9 has been specified as the missing value code. In most KWIKSTAT procedures, there is a casewise deletion of the record from calculation whenever a missing value is encountered.

Once you designate a missing value code for a variable, it is up to you to make sure that this code gets placed into your database in the proper records and fields. For example, if you have designated -9 as the missing value code for AGE, you must make sure that in your database a -9 appears in the field AGE if that data is missing or unknown.

The Indicate missing value codes option on the FILE Menu is used to set up these values.
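The effect of a missing value code on a calculation can be sketched in a few lines of Python. This is an illustration of the idea only, not KWIKSTAT's internal routine. The AGE values and the mean_excluding helper are hypothetical; the code -9 is the example used above.

```python
MISSING_AGE = -9  # missing value code designated for AGE, as in the example above

# Hypothetical AGE values; two records carry the missing value code.
ages = [34, -9, 41, 29, -9, 36]


def mean_excluding(values, missing_code):
    """Return (mean, n) using only values that are not the missing code.

    Returns (None, 0) when every value is missing.
    """
    valid = [v for v in values if v != missing_code]
    if not valid:
        return None, 0
    return sum(valid) / len(valid), len(valid)


mean_age, n_used = mean_excluding(ages, MISSING_AGE)

# What happens if the code is NOT declared: -9 is averaged in as real data.
naive_mean = sum(ages) / len(ages)

print(mean_age, n_used, round(naive_mean, 2))
```

The gap between the two means shows why the code has to be declared: left undeclared, -9 is treated as a real age and pulls the mean down.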
When the Indicate missing value codes option is selected, the program will display an entry screen that is similar to a data entry screen. You may enter one missing value for each field name. The missing value must obey the definition of the field in terms of length and type.

Once missing values are entered, they are stored on disk in a file named filename.MV, where "filename" is the name of the designated database. If a new variable is created using the transformation procedure, its missing value is appended to the missing value file.

You may change or correct the missing values for a database at any time by calling up this option. If missing values are already designated for the database, they will be displayed on the entry screen, and you may edit them or accept them as they are.

Note: If missing values are NOT used, and there is a blank numeric variable in a calculation, it may be treated like the value 0 (zero), so it is important to use missing values if your data contains such entries. Otherwise, the statistical calculations may be in error!

PACK DATABASE

Records marked for delete are not actually erased from the file at the time they are marked. However, they will be ignored in most analyses, and will continue to be displayed when you edit the database. See "Edit Mode Options" later.

If you want to permanently get rid of the records you have marked for delete, choose the Pack procedure from the FILE Menu. This procedure erases all "deleted" records from the database.

ZAP - GET RID OF ALL RECORDS

The Zap option in the EDIT menu allows you to quickly erase all records from a database. To use this option, open a database, then choose Zap.
USING THE KWIKSTAT DATA ENTRY SCREENS

If there is a need to change data already in a database, you may choose the Edit records option from the EDIT menu. Editing is similar to entering data. Use the up and down arrow keys to move from field to field within a record and type in any change you want to make in the field. The section below describes the menu bar options available to you in the data entry screen.

USING APPEND & ENTRY SCREEN MENU BAR OPTIONS

When you are appending information to the database, there are several function key options that you can choose. These options are listed at the bottom of the entry screen.

APPEND MODE OPTIONS

To choose an option, press the function key related to the option, or point to the option with the mouse and click. The options available are:

F1 Help - Displays the KWIKSTAT Help menu.

F2 Edit - Toggles between edit mode (correct current entries) and append mode (add new entries).

F7 Exit - Exits entry mode and returns you to the main KWIKSTAT menu.

F8 Switch - Switches between spreadsheet type entry and database entry mode.

F10 Print - Prints the current record to the printer or to a file.

EDIT MODE OPTIONS

When you are in edit mode, a slightly different menu bar appears containing the following items:

F1 Help - Displays the KWIKSTAT Help menu.

F2 Append - Toggles between edit mode and append mode.

F3 Delete - Marks one or more records for deletion. When you choose this option you will be able to choose from the following options:

     A) Mark this record for delete
     B) UNMark all records where fieldname = value
     C) Mark all records for delete where fieldname = value
     D) UNMark a range of records
     E) Mark a range of records
     Q) Quit this option

You can mark as many records as you choose.
Once you have marked the records for delete, you can pack the database using the Pack option in the Edit pull-down menu.

ANALYSIS TIP: Marking records for delete is a quick and simple way to do an analysis on a portion of your database. Mark the records you want to eliminate from an analysis, then perform the analysis. KWIKSTAT will ignore the deleted records. Later, you can undelete the records.

Note: ^U (Ctrl-U) also deletes and undeletes single records. Place your cursor on the record to delete or undelete, and press ^U.

F4 Erase/Insert - Erases the current record permanently from the database, or inserts blank records. (Only in spreadsheet entry mode.) When you choose this option the following menu items will appear:

     A) Erase records beginning with record #
     B) Insert blank records before record #
     Q) Quit this option

To erase one or more records, display the record in Edit mode. Highlight the record to erase, and press the function key F4. You can then choose to erase the single record or a range of records.

F5 Goto - Go to a record number.

F6 Undo - Returns last record changed to its previous values.

F7 Exit - Exits entry mode and returns you to the main KWIKSTAT menu.

F8 Switch - Switches between spreadsheet type entry and database entry mode.

F9 Field - Insert or Delete a field in the database or Replace the contents of a field. See "Creating New Fields and Replacing the Contents of a Field" below.

F10 Print - Prints the contents of the current record to a printer or file.

CREATING NEW FIELDS AND REPLACING THE CONTENTS OF A FIELD

Using the F9 Field option in the Spreadsheet data entry screen, you can create new blank fields of any field type, and place information in those fields that is either a numeric or character expression.
The option menu displayed when you choose F9 Field is:

     A) Insert a new field after the field: fieldname
     B) Delete the field named: fieldname
     C) Replace contents of field: fieldname
     D) Set missing values for fields
     Q) Quit this option

The following sections "Creating a New Field" and "Replacing the Contents of a Field" describe these procedures.

CREATING A NEW FIELD

You may create a new field in a database within an edit screen by choosing the F9 (FIELD Insert) option. After creating a new field, you can then use the F9 (FIELD Replace) option to place a value in the new field.

When you choose the Field/Insert option in the edit screen (F9), you will be prompted to enter information about the new field:

     Define a name for the new field
     Define the field type
     Define a width for the new field
     For numeric variables, define the number of decimals, if any
     Define a missing value code. If none is selected, it is assumed to be 0 (zero).

After entering a new name for the field, you will be prompted to enter the field type, width and decimals (if numeric). For example, if your new field is numeric with a width of 8, and 2 decimal places, you will enter

     N,8,2

If the field is Character with a width of 3, you would enter

     C,3

After you enter the type and width information, you will be asked if you want to enter missing value codes.

Note: All of the normal restrictions of defining a field remain.

TIP: To create a new field containing a new value that is a numeric transformation of other fields, first insert the new field using the F9 Field/Insert option, then use the F9 Field/Replace option to place the value in the new field.
REPLACING THE CONTENTS OF A FIELD (TRANSFORMATIONS)

You can use the F9 Field/Replace option in the Edit screen to replace the existing contents of a field, or place new information in a newly created blank field. KWIKSTAT provides a number of numeric and character functions to enable you to do this. For example, if you wanted to replace the contents of the field NEW with the values TIME1/AGE:

Step 1 Highlight the field to replace: In the edit mode, highlight the field whose contents you want to replace. Press the F9 (Field) option, and choose "Replace the Contents of a Field" option from the Field menu. A dialog box will appear.

Step 2 Specify which records to replace: The default is ALL, which means all records in the database. Or, enter a range such as 1-20, which would mean only perform the replacement in records numbered 1 through 20. Then, press Enter.

Step 3 Specify what to place in the field: For example, enter the formula TIME1/AGE in the Replace With entry field, where TIME1 and AGE are two other fields in the same database.

Step 4 Specify a condition for replacing (if any): The default is NONE. For example, if you only want the replacement to be for records whose value of AGE is greater than 20, you would enter the expression AGE>20 in the condition entry field.

Step 5 Begin the replacement: Press F7 when you have finished entering the Replace information, and the replace will begin. When it is finished, you will return to the edit screen.

USING DATABASE AND MATHEMATICAL EXPRESSIONS

The kinds of expressions you can use in Subset and Replace options are described below. KWIKSTAT supports two kinds of expressions.

     1. Database expressions (most normally used.)
     2. Mathematical expressions (signaled with an = preceding the expression.)
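The Replace procedure in Steps 2 through 4 above (a record range, a Replace With formula and an optional condition) can be sketched in Python as follows. This is a hedged illustration of the logic, not the program's own implementation. The replace_field helper and the sample rows are hypothetical, and record numbers are assumed to start at 1.

```python
def replace_field(rows, field, expression, condition=None, first=1, last=None):
    """Replace rows[i][field] with expression(row) for selected records.

    first/last give an inclusive range of 1-based record numbers
    (last=None means through the final record, like the ALL default).
    condition is an optional test; None plays the role of NONE.
    Returns the number of records changed.
    """
    last = len(rows) if last is None else last
    changed = 0
    for recno, row in enumerate(rows, start=1):
        if recno < first or recno > last:
            continue  # outside the requested record range
        if condition is not None and not condition(row):
            continue  # condition entered, but not met for this record
        row[field] = expression(row)
        changed += 1
    return changed


# Hypothetical database with the fields named in the example above.
rows = [
    {"TIME1": 40.0, "AGE": 20, "NEW": 0.0},
    {"TIME1": 50.0, "AGE": 25, "NEW": 0.0},
    {"TIME1": 63.0, "AGE": 30, "NEW": 0.0},
]

# Records 1-2, Replace With TIME1/AGE, Condition AGE>20
n_changed = replace_field(
    rows,
    "NEW",
    expression=lambda r: r["TIME1"] / r["AGE"],
    condition=lambda r: r["AGE"] > 20,
    first=1,
    last=2,
)
print(n_changed, [r["NEW"] for r in rows])
```

Only record 2 is changed here: record 1 fails the AGE>20 condition and record 3 lies outside the 1-2 range.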
USING DATABASE EXPRESSIONS

Database expressions allow the use of common character, numeric, date and logical fields in the expression. Use the mathematical expression only when you must use mathematical functions not in the normal database expression list. Here are the usage criteria:

     In a REPLACE WITH field: Use either a database expression or a math expression.
     In a CONDITION field: Use only a database expression.

Most expressions can be handled with the database expressions. If you find that you cannot create an expression using the database functions, go to the section titled "Using Mathematical Expressions" later in this chapter. The following information on how to use expressions is useful for both the database and mathematical expression types:

Arithmetic operators:

     Add              +
     Subtract         -
     Divide           /
     Multiply         *
     Exponentiation   ^   (Mathematical expressions only)

For Character fields, the database calculator supports the operation:

     Add              +   (appends one string to another)

Following are a few examples of correct expressions:

     AGE/HEIGHT
     LTRIM(FIRST)+' '+LAST
     (AGE*TIME1)+3.2

Note: Literal strings included in expressions must be surrounded by single quotes. For example, 'Hello' is a literal string. Character field names are used without quotes. For example, NAME is a field name. A correct string expression using these two strings would be:

     'Hello '+NAME

DATABASE CALCULATOR FUNCTIONS SUPPORTED

The following functions may be used in expressions both in the "Replace With" and "Condition" fields.
In this table the arguments have the following meanings:

     NUM      - Numeric argument
     STG      - String (Character) argument
     DATEFMT  - Date argument, MM/DD/YY
     LOGICAL  - T or F
     LEXP     - Logical Expression
     AEXP     - Any Expression
     |        - means OR, choose one or the other option
     [ ]      - means optional argument

Database Calculator Functions

Name       Meaning            Example of use           Type
---------  -----------------  -----------------------  ----
ABS        Absolute value     ABS(NUM)                 N
ASC        Ascii value        ASC(STG)                 N
AT         AT Find            AT(STG1,STG2)            N
CALENDAR   Number to Date     CALENDAR(NUM)            D
CAPS       First Letter Cap   CAPS(STG)                C
CHR        Number to String   CHR(NUM)                 C
DATE       System Date        DATE()                   D
DELETED    Is record Deleted  DELETED()                L
IIF        Logical If         IIF(LEXP,AEXP1,AEXP2)    CNL
INT        Integer Round      INT(NUM)                 N
JULIAN     Date to Number     JULIAN(DATE)             N
LEFT       Left string        LEFT(STG,NUM)            C
LEN        String Length      LEN(STG)                 N
LOWER      Lower Case         LOWER(STG)               C
LTRIM      Trim Left          LTRIM(STG)               C
MAX        Max of 2 Nums      MAX(NUM1,NUM2)           N
MIN        Min of 2 Nums      MIN(NUM1,NUM2)           N
REPLICATE  Repeat String      REPLICATE(STG,NUM)       C
RIGHT      Right String       RIGHT(STG,NUM)           C
RTRIM      Trim Right         RTRIM(STG)               C
SPACE      Space              SPACE(NUM)               C
STR        Number to String   STR(NUM)                 C
STRING     Create String      STRING(NUM,NUM|STR)      C
STUFF      Stuff String       STUFF(STG,NUM,NUM,STG2)  C
SUBSTR     Extract String     SUBSTR(STG,NUM,[NUM])    C
TIME       System Time        TIME()                   C
TRIM       Trim blanks        TRIM(STG)                C
UPPER      Upper Case         UPPER(STG)               C
VAL        String to Number   VAL(STG)                 N

The following functions are supported only in the "Replace With" entry field, and only for numeric field types. You MUST precede expressions using these functions with an = sign.
An example of the RECODE function, which appears in the following table, is:

     =RECODE(SCORE,1,AGE,10,15)

The five arguments in the RECODE function are:

     No.  Example  Meaning
     1    SCORE    Field to use in compare
     2    1        Value to assign if comparison is true
     3    AGE      Value to assign if comparison is false
     4    10       Low range of field to compare
     5    15       High range of field to compare

Thus, this example means that the value of the RECODE is 1 if SCORE is between 10 and 15, else the value is the current value of the AGE field for that record.

USING MATHEMATICAL EXPRESSIONS

In the REPLACE WITH field, the default expression type is the database type. In order for an expression to be evaluated as a strictly math expression, you must place an equal sign "=" at the beginning of the expression.

The major difference between the database and mathematical expression types is their capabilities. The database expression can handle most common calculations, including simple math, string evaluation, and date evaluation. The math expression can be used only for strictly numeric calculations that use one or more of the functions listed in the table below, or that use the exponentiation operator.

For example, if you want to perform the calculation WEIGHT/HEIGHT, you can enter the expression as is in the REPLACE WITH field. However, if you want to calculate the log of WEIGHT/HEIGHT, you must enter the expression as

     =LOG(WEIGHT/HEIGHT)

since the LOG function is not supported as a database expression function. The equal sign signals to the program to use the math calculator.
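The five-argument RECODE described above can be pictured with the small Python sketch below. It is a hypothetical illustration only. In particular, it assumes that "between 10 and 15" includes both endpoints, which the text does not state, and the recode function and sample records are not part of KWIKSTAT.

```python
def recode(value, if_true, if_false, low, high):
    """Return if_true when low <= value <= high, otherwise if_false.

    ASSUMPTION: both ends of the range are inclusive; the manual only
    says "between" and does not specify how endpoints are handled.
    """
    return if_true if low <= value <= high else if_false


# Hypothetical records with the SCORE and AGE fields from the example.
records = [
    {"SCORE": 12, "AGE": 40},
    {"SCORE": 20, "AGE": 33},
]

# Equivalent in spirit to =RECODE(SCORE,1,AGE,10,15)
recoded = [recode(r["SCORE"], 1, r["AGE"], 10, 15) for r in records]
print(recoded)
```

The first record's SCORE falls inside the range and is recoded to 1; the second falls outside and takes that record's AGE value.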
For example, if you want to create a field that contains the record number, you would use the expression

     =RECNO

To create a field containing a random number from 0 to 100, you would use the expression

     =RAND*100

Math Calculator Functions

Name     Meaning             Example of use
-------  ------------------  --------------------------
ABS      Absolute value      ABS(SCORE)
AVE      Average (Mean)      AVE(LIST)
ACOS     Arc Cosine          ACOS(SCORE)
ASIN     Arc Sine            ASIN(SCORE)
ATAN     Arc Tangent         ATAN(SCORE)
ATAN2    Arc Tangent y/x     ATAN2(y,x)
CSC      Cosecant            CSC(SCORE)
COS      Cosine              COS(SCORE)
COT      Cotangent           COT(SCORE)
EXP      Exponentiation      EXP(SCORE)
INT      Integer             INT(SCORE)
LN       Natural Log         LN(SCORE)
LOG      Log base 10         LOG(SCORE)
MAX      Maximum of list     MAX(1,T2,3)
MIN      Minimum of list     MIN(1,T2,T3)
MOD      MOD of number       MOD(9,2) is 9 mod 2
PI       PI                  PI = 3.14159265358979
RAND     Random number       number between 0 and 1
RECNO    Record number       database record number
RECODE   Recode number       RECODE(SCORE,1,0,1.1,2,2)
ROUND    Rounds a number     ROUND(1.236,2)=1.24
SD       Standard Deviation  SD(LIST)
SEC      Secant              SEC(SCORE)
SIN      Sine                SIN(SCORE)
SQRT     Square root         SQRT(SCORE)
SUM      Sum of list         SUM(1,2,3) = 6
TAN      Tangent             TAN(SCORE)

ENTERING & IMPORTING DATA INTO KWIKSTAT

When you choose the Append records... option from the FILE Menu, you will be asked to specify entry from the keyboard or from a file (ASCII file). For most small data sets, you will probably enter data from the keyboard. If your data is already in another program that supports ASCII, dBASE or 1-2-3 type files, you may be able to import the data from that program into KWIKSTAT. The following information describes how to enter data from the keyboard, from an ASCII file or from other programs.
Once you have opened or created a database, you can enter data from the keyboard by choosing the Append Records/from keyboard option from the EDIT pull-down menu. When you choose this option, a sub-menu will appear allowing you to choose to enter data from the keyboard, from an ASCII file or from a DBF file.

APPENDING DATA FROM ANOTHER DBASE FILE

If you have a dBASE file containing data that you want to append to a current dBASE file, use the following procedure.

Note: For this procedure to work, the names of the fields in the two databases must be the same. Only fields with the same name will be imported.

Step 1 Open a database: Open a dBASE file by choosing the Open a Database option on the FILE Menu.

Step 2 Choose Data Entry: From the EDIT menu, choose the Append Data, from dBASE option.

Step 3 Specify the name of the dBASE file to read from: Enter the name of the dBASE file containing the records you want to append. KWIKSTAT will read the data, and will append data from the new database based on the fieldnames in the currently opened database.

Step 4 Verify the import: After appending, you should perform a list to verify that the database contains the information you want.

USING LOTUS 1-2-3 TYPE FILES

If you have a Lotus 1-2-3 WKS or WK1 file and cannot use the Lotus program to translate it, you can use the KWIKSTAT import feature from the File Utilities/Import option from the FILE Menu.

Note: Import will allow you to import a maximum of 128 fields.

USING COMMA DELIMITED ASCII FILES

If your program outputs comma delimited ASCII files, that is, there is a comma between each field, KWIKSTAT can import this data using the "Comma Delimited" option in the File Utilities FILE Menu option. The data to be imported can contain numbers and character fields.
Character fields must be enclosed in quotes (""). An example file on disk is EXCOMMA.DAT. The first few lines of this file are:

     "A",12,22.3,25.3,28.2,30.6,5,"Text"
     "A",11,22.8,27.5,33.3,35.8,5,"Text"
     "B",12,22.8,30.0,32.8,31.0,4,"Text"
     "A",12,18.5,26.0,29.0,27.9,5,"Text"

The import procedure looks at the first line of the file to determine how many fields to create. This file has 8 fields. The first and last are character. The fields will be named VAR1, VAR2, etc. You can change these names in the "modify database" option, main menu. The import will attempt to create widths that will allow full storage of numbers and text. Everything after that is automated. Look at the file with the List option to verify that the imported data is correct.

EXPORTING DATA

You may output the data from your KWIKSTAT (DBF) file into a standard ASCII TEXT file. (Often called an SDF file - Standard Data Format file.) Outputting the data is useful for transferring your data to other programs. Along with the output of data, you may also output a "format" file, which describes the contents of the text file. This file can be output in "dBASE" style or "SAS" (Statistical Analysis System) style. The SAS format could be used in a SAS INPUT statement to read the ASCII data file into the SAS program.

To export data, choose the File Utilities option from the FILE pull-down menu to display the KWIKSTAT utilities menu. From the Utilities menu, choose the "Output data to an ASCII file (SDF Standard Data Format)" option.

PRINTING A REPORT

You may output a listing of the data in the database (or a selected subset of the database) by using the report facility. To use this procedure, choose the "Report: Output data in a report format" option after choosing the File Utilities option on the FILE Menu.
In this procedure you may specify the following report features:

     Which Data Fields To Output
     Output Record Number As A Column
     Title
     Number Of Lines Per Page
     Width Of Page (default is 80)
     Output To A File Or Printer
     Output A Subset Of The Data (search)

NOTE: You may want to place a coded variable in your data set which will allow you to easily select a subset of data to output.

Subset searches can be:

     1) Exact: case is ignored.
     2) First one or more letters in a field: (AL* matches ALLEN, ALBERT, etc.)
     3) Keyword: match a letter pattern within a field (i.e., [AL] matches ALLEN, BALES, etc.)

The REPORT procedure is menu driven. Simply answer the questions as you are prompted. If the report is too wide to fit on a single width of the specified paper width, the report will be printed in parts.

=====================================
BASIC STATISTICAL ANALYSIS PROCEDURES
=====================================

This section of the KWIKSTAT manual describes the statistical analysis procedures available in the BASIC KWIKSTAT edition. The data generation and simulations module contains several examples of statistical concepts such as a coin flip, sampling from a distribution, and confidence intervals.

USING DESCRIPTIVE STATISTICS

The Descriptive Statistics module allows you to examine summary statistics of the data in a database.
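As a rough illustration of the kind of summary statistics this module reports, the sketch below computes several of them with Python's standard statistics module rather than with KWIKSTAT itself. The three grades are the ones used in the MEAN definition later in this section; the variable names are hypothetical, and the standard error of the mean is taken as the n-1 standard deviation divided by the square root of N.

```python
import math
import statistics

# The three grades used in the MEAN definition later in this section.
grades = [82, 100, 88]

n = len(grades)
total = sum(grades)
mean = statistics.mean(grades)
median = statistics.median(grades)

# Standard deviation computed two ways, as the manual describes:
sd_n_minus_1 = statistics.stdev(grades)   # divisor n-1 (sample)
sd_n = statistics.pstdev(grades)          # divisor n (population)

variance = statistics.variance(grades)    # n-1 version
sem = sd_n_minus_1 / math.sqrt(n)         # standard error of the mean

print(f"N={n} SUM={total} MEAN={mean} MEDIAN={median}")
print(f"ST.DEV.(n-1)={sd_n_minus_1:.3f} ST.DEV.(n)={sd_n:.3f}")
print(f"VARIANCE={variance} S.E.M.={sem:.3f}")
print(f"MINIMUM={min(grades)} MAXIMUM={max(grades)}")
```

With only three values the n-1 and n standard deviations differ noticeably; the difference shrinks as N grows.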
DETAILED STATISTICS FROM AN EXISTING DATABASE

This option calculates the mean, standard deviation, median, standard error of the mean, minimum, maximum, sum, variance and other descriptive statistics for a single variable (field) from a set of data. If your data is already in a database, perform the analysis using the following steps. For example, suppose you want to calculate statistics for the TIME1 field in the EXAMPLE database.

Step 1 Open the database: Choose Open Database from the FILE pull-down menu. Select the EXAMPLE database.

Step 2 Choose Analysis option: Choose the Descriptive Statistics option from the ANALYZE menu.

Step 3 Choose the analysis type: Choose the "Detailed Statistics" option from the Descriptive Statistics menu.

Step 4 Choose the field to analyze: Choose the TIME1 field. A screen will appear displaying statistics on that variable.

DEFINITIONS

C. I. - Confidence interval - This is a range that describes (with some confidence -- usually 95% confidence) where the actual mean of the data probably lies. For example, if the reported interval is 20.79 to 21.23, the true mean is estimated to lie somewhere between those two values, with 95% confidence.

MAXIMUM - The largest number.

MEAN - A measure of central tendency. The arithmetic average. For example, if you average the three grades 82, 100 and 88, (82+100+88)/3 = 90 -- the average (or mean) is 90.

MEDIAN - A measure of central tendency. The median is a statistic such that 50% of all numbers in the sample are above the median and 50% are below the median. For example, in the list 1, 2, 3, 4, 5 the median would be 3.

MINIMUM - The smallest number.

MISSING - Reports how many numbers had a missing value code.

N - How many numbers were used to calculate the statistics.
PERCENTILES - Tells you what percent of numbers are lower than the percentile. For example, the 50th percentile is the median.

S.E.M. - The Standard Error of the Mean measures how precisely the sample mean estimates the true mean. It is the standard deviation divided by the square root of N.

ST. DEV. - Standard Deviation - A measure of the spread of the data. It is calculated two ways, using n-1 as a divisor and using n as a divisor. Most people use the n-1 version.

SUM - The total of all the numbers added together.

TEST FOR NORMALITY - (Professional edition) - This is a test of whether the data is normally distributed. The test statistic is D. If the p-value is < 0.05, there is evidence that the data are NOT normal.

TUKEY 5 NUMBER SUMMARY - Essentially, the 0th, 25th, 50th, 75th and 100th percentiles. See the Hoaglin, et al. reference.

VARIANCE - A measure of the spread of the data.

SUMMARY STATISTICS ON A NUMBER OF VARIABLES

This option allows you to calculate statistics on several variables (sample size, mean, standard deviation, minimum, maximum, and standard error of the mean). If you have a grouping variable in your database, you may request output of summary statistics by group. Suppose you want to know the means of all the quantitative variables (AGE, TIME1, TIME2, TIME3, TIME4, STATUS) within each of the three groups (A, B, C) in the EXAMPLE database. Follow these steps:

Step 1 Open the database: Choose Open Database from the FILE pull-down menu. Select the EXAMPLE database.

Step 2 Choose Analysis option: Choose the Descriptive Statistics option from the ANALYZE menu.

Step 3 Choose the analysis type: Choose the "Summary Statistics" option from the Descriptive Statistics menu.
Step 4  Choose the fields to analyze: Select the fields AGE, TIME1, TIME2, TIME3, TIME4, then select the "Finish choosing fields" option.

Step 5  Choose the grouping field: Select STATUS as the grouping field. The KWIKSTAT viewer will appear displaying statistics on those variables.

Step 6  Exit the viewer: Then exit the module or perform another analysis.

DETAILED STATISTICS FROM DATA ENTERED BY COUNTS

If you have a small amount of data, or if your data is grouped so that you know how many of each number you have (e.g., you have 12 people 13 years old, 5 people 14 years old, 6 people 15 years old, etc.), you can enter the data at the keyboard. When you choose this option a screen will appear allowing you to enter your data. Either enter a single value, or enter a value followed by a comma and the number of times that value should be used. For example, entering 34,5 would mean that you are entering 5 values of 34. Entering nothing on a line signals that you are finished entering the data. A detailed statistics screen will appear similar to the one described above.

APPROXIMATE P-VALUE DETERMINATION

This option calculates p-values for four test statistics: normal (z), Student's t, F, and chi-square. Enter the statistic, degrees of freedom and the calculated value of the statistic, and the program will tell you the p-value associated with that statistic. To calculate a p-value, follow these steps:

Step 1  Begin the Descriptive Statistics module: From the main menu bar, open the ANALYZE pull-down menu, then choose the Descriptive Statistics option.

Step 2  Choose the p-value option: Select Approximate p-value determination.

Step 3  Enter the p-value information: You will be prompted to enter an equation. On the left hand side, designate the test statistic being used, i.e., z= , t(df)= , F(dfn,dfd)= , or X(df)=. In parentheses, as shown by (df), enter the appropriate degrees of freedom.
On the right hand side of this equation, enter the calculated value of the statistic for which you wish to know the p-value. For example:

     t(20)=2

means you want the two-sided p-value for a calculated t-statistic of 2.00 with 20 degrees of freedom. In this case, the result is p=0.059.

Step 4  Exit p-values: To exit the p-value determination procedure, enter End.

STEM AND LEAF DISPLAY

The Stem and Leaf Display is a graph created from a series of numbers. The stem part of the display is the leading digit of the data (such as the 5 in 54) and the leaf is the trailing digit (such as the 4 in 54). When larger numbers are used, the rightmost digits are often ignored. For example, if the numbers range from 241 to 845, the stem might be 2 to 8, representing 200 to 800, and the leaf would be 0 to 9, representing the 10's. The 1's place would be ignored. KWIKSTAT gives you options for choosing the magnitude of the stem and leaf values.

DESCRIPTIVE AND COMPARATIVE GRAPHS

The Graphs - Descriptive and Comparative module allows you to create a number of different charts and graphs.

CREATING A BAR, LINE OR AREA CHART

The Bar/Line/Area chart option allows you to create a graph using any combination of these kinds of charts. You can also choose other options for this graph as discussed later in this section. The following two examples show you how to display a bar chart, first by creating a new database and then by using a current database.

BAR CHART EXAMPLE 1

Step 1  Create the database: Create the database and enter the data. Your database should contain a Label field and a Value field (the Value field contains the numbers to use for the plot). You can use the "SIMPLE BAR CHART" pre-defined structure when creating a new database for this graph.
For this example MAGNET is the LABEL field and NAILS is the VALUE field.

          ----These are the fields.-----
RECORD    LABEL     VALUE
          ------    ------
  1       SMALL       31  -+
  2       MEDIUM      38   |--- This is the data to plot.
  3       LARGE       51  -+
            |
            +----------------- These are the plot labels.

Step 2  Enter the Data: Enter the 3 records shown above.

Step 3  Choose Analysis option: Choose the Graphs option from the ANALYZE menu.

Step 4  Choose the analysis type: Choose the "Bar/Line/Area" option from the Graphs menu.

Step 5  Choose the field to graph: Since there is only one numeric field, KWIKSTAT automatically selects it as the data field. The resulting chart contains 3 bars labeled SMALL, MEDIUM and LARGE. (You could also use this same data to create a pie chart.)

BAR CHART EXAMPLE 2

This example uses data already in the database named BARCHART. When creating a bar/line/area chart, you can choose more than one value field and create a side-by-side bar chart, a stacked bar chart, a line chart, an area chart, a point chart, or a chart containing a mixture of these types. The data in this example will be used to create a chart that includes 2 bars and a line graph.

Step 1  Open the database: Open the database named BARCHART.

Step 2  Choose fields: Choose the fields VAR1, VAR2 and VAR3 as the data fields, then choose "Finished Choosing Fields." Select LABEL as the label field. A chart will appear containing three side-by-side bars.

Step 3  Choose <options> from the graph menu: An option screen will appear that allows you to enter a title, footnote and other options. At the bottom of the screen choose <PgDn-Next>. This will display the second option screen. See "Options While Displaying a Bar Chart" below.
Step 4  Choose types: For VAR3 choose the W (Wide Line) type by entering a W in the Type column or by clicking on the option until a W appears.

Step 5  Display plot: Choose <F7-Continue> to display the plot. You can use the <get color> option to select whether the bars will use colors or tile patterns. Your chart can contain any combination of bar, line, area and point charts.

While displaying a bar chart, you can choose from a number of options. Some of these options are reached by choosing <options> from the menu at the top of the screen; a two-screen option form will appear. On the first screen of the form, you can specify the axes labels, footnotes, and whether the bars should be stacked. Select the <Next> option to display the second screen of the option form. On this form you can specify:

     Legend - For each bar/line etc.
     Type - Choose from Bar, Line, Wide Line, Area or Point
     Cumulative - Yes or No
     Color or Patterns - Choose from 14 colors or 10 patterns. Patterns are displayed if the Tile option has been chosen from the main graph menu.
     Display Counts - Yes or No

Clicking on the Type, Cumulative or Display Counts option cycles through the list of choices for that option. For Legend, Colors or Patterns, enter the desired legend, or the color or pattern number. After making changes in the options, display the chart by selecting the <F7-Continue> option.

CREATING A PIE CHART: A pie chart is created from a list of counts. (See Help)

CREATING A TIME SERIES/LINE PLOT: A time series plot is useful in examining data that are time related, such as profit by month, etc. (See Help)

CREATING AN XY PLOT (SCATTERPLOT): An XY plot (scatterplot) displays the relationship between two variables.
(See Help)

PRODUCING A HISTOGRAM: A histogram can be helpful in determining if the distribution of a continuous variable is approximated by a normal distribution. (See Help)

DISPLAYING BY-GROUP PLOTS: By-group plots allow you to display graphs that show comparisons across groups. (See Help) Types of plots you can display in by-group plots include:

     A) Box Plot Comparison
     B) Dot Plot Comparison
     C) Mean and Error Bar Comparison

3-D SCATTERPLOT/SPIN PLOT

The KWIKSTAT "SPIN" module allows you to interactively view three dimensional (XYZ) scatterplots. You can rotate the plot along the x, y, or z axes, spin the data, and choose other display options.

CREATING A SPIN PLOT

The Spin Plot procedure expects your database to contain at least three numeric fields and (optionally) a group field. From these fields, it can display a three dimensional XYZ plot. For example, using the CAR database, follow these steps to create a graph:

Step 1  Open the database: From the main KWIKSTAT FILE menu, open the database named CAR. Or, if you are already in the SPIN module, open the database by selecting the "ChooSe a database to open" option.

Step 2  Begin the Spin module: If you have not already done so, begin the Spin plot module by choosing the Spin Plot option from the ANALYZE pull-down menu. The Spin Plot module menu will be displayed.

Step 3  Select the plot type: From the Spin Plot menu, choose the "3D XYZ Data visualization plot."

Step 4  Select the fields to plot: A pick list of fields will appear allowing you to choose three data fields and, optionally, a grouping field. For the CAR database, choose the following fields (in this order): MPG, WEIGHT and HP. For the grouping field, choose CYLINDERS.
Note: When a grouping variable is used, the points for each group (up to 10 groups) are displayed in a different color or with a different shaped point (such as a circle, square, diamond, etc.). Once you have chosen the fields, an initial plot appears.

Step 5  Spin the Plot: To examine the relationships between the variables, spin the plot on one or more axes to view the relationships from different angles. Press an arrow key, PgUp or PgDn, or click on the Spin menu options.

SPIN PLOT MANIPULATION OPTIONS

Manually rotate the plot (up-, down-, left- and right-arrow, PgUp and PgDn keys) by choosing Roll, Pitch or Yaw. Either point to one of these options with the mouse and click, or press the designated keyboard button to choose one of these options. Click and hold on a spin option and the plot will continue spinning until you let up on the mouse button.

Automatically spin the plot by pressing CTRL plus a Roll, Pitch or Yaw button (e.g., CTRL-rightarrow). Stop the spinning plot by pressing a Roll, Pitch or Yaw button (without CTRL).

     Grow or Shrink: + or -
     Increase or decrease the degree of move: > or <

USING T-TESTS AND ANOVA PROCEDURES

T-tests and Analysis of Variance (ANOVA) procedures are used to test hypotheses about population means using data obtained through random sampling of those populations.

EXAMPLE: TWO SAMPLE T-TEST (INDEPENDENT GROUPS)

The data used here are heights of 13 plants grown using two different fertilizers. Suppose you want to know if there is a difference in the average heights of plants in the two treatment groups.
Data for independent group t-test (fertilizer study)

     Present        Newer
     Fertilizer     Fertilizer
     46.2 cm        51.3 cm
     55.6           52.4
     53.3           54.6
     44.8           52.2
     55.4           64.3
     56.0           55.0
     48.9

Step 1  Create a database: See the tutorial example earlier for information on how to enter this data into a database.

Step 2  Enter the data into the database.

Step 3  Select the analysis to perform: Choose the t-tests and Analysis of Variance (ANOVA) option from the ANALYZE pull-down menu, then choose the "Compare independent groups (t-test, ANOVA)" option.

Step 4  Select the fields to use: Select GROUP as the grouping variable and OBS (Height) as the data (response) variable. The results appear on the screen.

Step 5  Analyze the results: A menu will appear allowing you to view the results or display a graphical comparison. Choose to view the results. Exit the viewer by choosing F7-Exit.

Step 6  View the graphical comparison: Select "Graphical Comparison" from the options menu to display a comparison of the results. This is the same graph described earlier in the section "Displaying By Group Plots." Initially, this plot shows an error bars comparison. Exit the viewer by choosing the <Exit> option.

EXAMPLE: SINGLE FACTOR ANOVA

When more than two independent groups are compared with respect to one variable, one-way or single factor analysis of variance techniques are appropriate. This example uses data for hogs which have been randomly assigned to four groups, with each group being given a different feed. The response is weight gain.

Data for Independent Group ANOVA

     Gp 1     Gp 2     Gp 3     Gp 4
     60.8     78.7     92.6     86.9
     67.0     77.7     84.1     82.2
     54.6     76.3     90.5     83.7
     61.7     79.8              90.3

The database to analyze this data is similar to the one used for the t-test example above, differing only with respect to the number of groups.
In fact, this one-way ANOVA is an extension of the t-test when there are three or more groups. To perform this analysis, use these steps:

Step 1  Create or Open a database: Create a database using the "FOR INDEPENDENT GROUP T-TEST OR ANOVA" pre-defined structure. GROUP will be the grouping field. The groups will be numbered 1,2,3,4 according to the type of feed used. The response field will be OBS. Or, open the database named HOG, and skip to step 3.

Step 2  Enter the data: The data, as entered into the database, will look like this:

     RECNO   GROUP   OBS (WEIGHT)
       1       1       60.8
       2       1       67.0
       3       1       54.6
       4       1       61.7
       5       2       78.7
       6       2       77.7
       7       2       76.3
       8       2       79.8
       9       3       92.6
      10       3       84.1
      11       3       90.5
      12       4       86.9
      13       4       82.2
      14       4       83.7
      15       4       90.3

Step 3  Select the analysis to perform: Choose the t-tests and Analysis of Variance (ANOVA) option from the ANALYZE pull-down menu, then choose the "Compare independent groups (t-test, ANOVA)" option.

Step 4  Select the fields to use: Select GROUP as the grouping field and OBS (Weight) as the response variable. The results appear on the screen.

Step 5  Analyze the results: The results of this test are summarized in the p-value. In this case, the small p-value (p<<0.001) means that there is a significant difference between groups. This is taken as evidence of a "real" difference between feeds, a difference not due to chance.

The ANOVA tells you only that there is a difference among the feeds. In order to find out which groups are significantly different from which others, examine the multiple comparison results. The Newman-Keuls multiple comparison test (or whichever you specified at setup) will describe which of the means are significantly different from which others (at the 0.05 significance level). One part of the output is the multiple comparison test.
The test performed depends on what option you chose during setup. This example will describe the results of the Newman-Keuls test. The results of this test are as follows:

     Gp     Gp     Gp     Gp
      1      2      4      3
     ----   ----   -----------

The group numbers are given in increasing order of the value of their group means. That is, Group 1 has the smallest mean, Group 3 the largest. At the 0.05 significance level, the means of any two groups underscored by the same line are not significantly different. This display tells you that (at the 0.05 significance level):

1) The mean for group 1 (feed 1) is statistically significantly less than the means for all other groups.

2) The mean for group 2 (feed 2) is significantly greater than the mean for group 1, and significantly less than the means of groups 4 and 3.

3) The means for groups 4 and 3 are not significantly different from each other, but they are both significantly greater than the means of groups 1 and 2.

You can conclude that feeds 3 and 4 are better than feeds 1 and 2, but there is not enough evidence to say that either feed 3 or 4 is the best overall.

Step 6  View the graphical comparison: Select "Graphical Comparisons" from the options menu to display a graphical comparison of the results. This is the same graph described earlier in the section "By Group Plots." Initially, this plot shows a comparison using error bars. This figure shows a box plot comparison. From this plot you can visually see that groups 3 and 4 are similar and that group 1 is much lower than the rest. Exit the viewer by choosing the <Exit> option.

PARAMETRIC REPEATED MEASURES (PAIRED) ANALYSIS

Repeated measures are observations taken on the same or related subjects over time or in differing circumstances.
Examples would be weight loss, or reaction to a drug across time.

EXAMPLE PAIRED T-TEST

The data in this example are before and after weights for eight persons on a diet. Notice that in this case, both data values are taken from the SAME entity (person). Follow these steps to perform this analysis:

Step 1  Create or Open the database: Use the pre-defined database structure named "FOR PAIRED T-TEST OR McNEMAR's TEST." This will create a database with the fields REP1 and REP2. REP1 will be used for Before and REP2 will be used for After. Or, open the database named DIET, and skip to step 3.

Data for paired t-test

     Person   Before   After
       1       162      168
       2       170      136
       3       184      147
       4       164      159
       5       172      143
       6       176      161
       7       159      143
       8       170      145

Step 2  Enter the data: Enter the data for the eight records. The database should look similar to the listing of the data above.

Step 3  Choose the analysis: Choose the t-tests and Analysis of Variance (ANOVA) option from the ANALYZE pull-down menu. Then choose the "Compare repeated or paired data (t-test, ANOVA)" option from the module menu.

Step 4  Select the fields to use: Select REP1 (BEFORE) as the first field and REP2 (AFTER) as the second field.

Step 5  Analyze the results: Choose to view the results from the options menu. Select "Graphical Comparisons" from the options menu to display a graphical comparison of the results.

EXAMPLE ONE-WAY REPEATED MEASURES ANOVA

For more than a pair of repeated measures on the same subject, a one-way repeated measures analysis of variance is appropriate. The data in this example are repeated measures of reaction times of five persons after being treated with four drugs in randomized order.
One-way repeated measures ANOVA data

     Person   Drug 1   Drug 2   Drug 3   Drug 4
       1        31       29       17       35
       2        15       17       11       23
       3        25       21       19       31
       4        35       35       21       45
       5        27       27       15       31

The results of this ANOVA are summarized in the p-value. In this case, the small p-value (the p=0.000 on the "Repeated Factor" line in the ANOVA table) means that there is a statistically significant difference in the mean response times for the four drugs. The Newman-Keuls multiple comparison test (or whichever multiple comparison test you chose at setup) describes which of the means are significantly different from which others (at the 0.05 significance level). See previous ANOVA example.

SINGLE SAMPLE ANALYSIS

This option allows you to choose a single variable and test the hypothesis that its mean differs from a hypothesized mean. You must enter the hypothesized population mean.

DUNNETT'S TEST

Dunnett's test is a multiple comparison procedure following a one-way ANOVA that compares a control mean with the other means in the analysis.

NON-PARAMETRIC PROCEDURES

Non-parametric procedures are appropriate when the assumption of normality cannot be made for a small data set or when a large data set is known to be from a non-normal population. Non-parametric procedures are generally based on ranks rather than actual data values, so these procedures can also be useful when actual data values are not known, but the order or ranks of the data values are known.

MANN-WHITNEY PROCEDURE

If two independent groups, such as in this example, are being compared, the Mann-Whitney U-statistic is calculated.

EXAMPLE: MANN-WHITNEY TEST OF TWO INDEPENDENT GROUPS

The fertilizer data from the t-test example are used in this example. If you have not already created the database for this data set, do so now by referring to that example.
Follow these steps to do this example:

Step 1  Open the database: Open the database named FERTILIZ and choose the Non-Parametric Comparisons option from the ANALYZE menu. Or, if you are already in the Non-Parametric module, select the "ChooSe a Database" option and open the FERTILIZ database.

Step 2  Select analysis type: From the Non-Parametric Comparisons menu select "Independent groups - Mann-Whitney, Kruskal-Wallis."

Step 3  Select fields to use: Choose GROUP as the grouping variable and OBS (HEIGHT) as the data (response) variable.

Step 4  View the results: KWIKSTAT will display the results, including the Mann-Whitney U statistic, the rank sums, sample sizes and mean ranks of the groups, a z statistic and an approximate p-value. In this case, U'=24.00, U=18.00 (the two U values sum to 7 x 6 = 42, the product of the sample sizes), z=0.357 and p=0.721.

The p-value of 0.721 is large, so the null hypothesis of no difference in medians between groups is not rejected. There is not sufficient evidence based on this procedure to say that there is a difference between the median heights of plants in the two groups grown using different fertilizers.

KRUSKAL-WALLIS PROCEDURE

If more than two independent groups are being compared using non-parametric methods, KWIKSTAT uses the Kruskal-Wallis test. The database used is of the same form as for the one-way independent group analysis of variance. KWIKSTAT will display the Kruskal-Wallis H-statistic, the rank sums, sample sizes and mean ranks of the groups, a chi-square statistic and an approximate p-value, and a graph of the results similar to the one described for the one-way ANOVA.
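The Mann-Whitney figures in the fertilizer example can be checked by hand. The following is a minimal sketch in Python (it is not part of KWIKSTAT, and the function name is made up for illustration). It assumes the first column of the fertilizer table holds the seven "present" heights and the second column the six "newer" heights, and it assumes a normal approximation with a 0.5 continuity correction and no tie correction. That assumption happens to reproduce the z and p values quoted in the example, but whether KWIKSTAT computes them exactly this way is not stated in this guide.

```python
import math

# Plant heights (cm) from the fertilizer table in the t-test example.
# Assumed column assignment: 7 "present" values, 6 "newer" values.
present = [46.2, 55.6, 53.3, 44.8, 55.4, 56.0, 48.9]
newer = [51.3, 52.4, 54.6, 52.2, 64.3, 55.0]


def mann_whitney(a, b):
    """Return (larger U, smaller U, z, two-sided p) for two samples."""
    n1, n2 = len(a), len(b)
    # Count the pairs in which the value from b exceeds the value from a.
    # A tied pair counts as one half.
    u_b = sum(1.0 if y > x else 0.5 if y == x else 0.0
              for x in a for y in b)
    u_a = n1 * n2 - u_b          # the two U values always sum to n1*n2
    u_large, u_small = max(u_a, u_b), min(u_a, u_b)
    # Normal approximation with a 0.5 continuity correction.
    mean_u = n1 * n2 / 2.0
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = max(u_large - mean_u - 0.5, 0.0) / sd_u
    p = math.erfc(z / math.sqrt(2.0))   # two-sided p-value
    return u_large, u_small, z, p


u_large, u_small, z, p = mann_whitney(present, newer)
print(f"U' = {u_large:.2f}, U = {u_small:.2f}, z = {z:.3f}, p = {p:.3f}")
```

Because the two U values always sum to the product of the sample sizes (7 x 6 = 42 here), a larger value of 24 implies a smaller value of 18. For samples this small an exact table of U is often preferred to the normal approximation; the sketch uses the approximation only because the example reports a z statistic.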
NON-PARAMETRIC REPEATED MEASURES ANALYSIS - FRIEDMAN'S TEST

When repeated observations are taken on the same subject, and there is interest in comparing the observations for each repeated measure (e.g., each type of treatment), then a repeated measures analysis may be appropriate. Data for Friedman's test is the same as described in the one-way repeated measures ANOVA. The results include a multiple comparison of groups.

NON-PARAMETRIC DICHOTOMOUS DATA ANALYSIS - COCHRAN'S Q

Cochran's Q procedure is a non-parametric procedure appropriate for use with dichotomous data when the experiment involves repeated measures on blocks. The response of the subjects to the treatments is dichotomous if it is taken as one of only two possible outcomes, often labeled "success" and "failure", rather than as a measurement.

USING REGRESSION & CORRELATION PROCEDURES

To examine the linear relationship between variables, correlation and linear regression are used.

SIMPLE LINEAR REGRESSION ANALYSIS

Data for this example of simple linear regression are Homicide Rate and Handgun Licenses Issued per 100,000 population for the years 1961 to 1973 in Detroit (Fisher, 1976, reprinted from Gunst and Mason, 1980).

Data for simple linear regression (handgun study)

     Year    Homicide    Handguns
             Rate        Registered
     1961      8.60       178.15
     1962      8.90       156.41
     1963      8.52       198.02
     1964      8.89       222.10
     1965     13.07       301.92
     1966     14.57       391.22
     1967     21.36       665.56
     1968     28.03      1131.21
     1969     31.49       837.60
     1970     37.39       794.90
     1971     46.26       817.74
     1972     47.24       583.17
     1973     52.33       709.59

Since you want to compare the homicide rate with handguns registered, you need a database with only these two sets of numbers (you can exclude year). The data for this example is stored on your disk as HANDGUNS.DBF with the variables HOMICIDES and HANDGUNS.
See chapter 2 if you need information about how to create a database. To perform a simple linear regression using this data, follow these steps:

Step 1  Open the database: Open the database named HANDGUNS. If you are at the main menu, select the Open a Database option from the FILE menu, then choose Regression and Correlation from the ANALYZE menu. If you are in the Regression module, select "ChooSe a Database."

Step 2  Select analysis type: From the Regression menu, choose the "Simple Linear Regression" option.

Step 3  Select the fields to use: Select HOM_RATE as the DEPENDENT (Y) variable first, then select HAND_REG as the INDEPENDENT (X) variable.

Step 4  View the results: KWIKSTAT will perform calculations and display a menu. Choose "View/Print Results," which will display information as shown in figure 4.21. Exit the viewer with F7/Exit.

Step 5  View the Plots: View a scatterplot of the original data with the fitted regression line, and a plot of the residual values, by choosing "Display Plot" items from the options menu.

Step 6  Forecast new values: You may optionally want to predict new values from the calculated regression line. If so, choose that option, and enter the number or range of numbers of the Independent variable you want to use to predict.

Pearson's correlation coefficient (r) is reported (0.7263) as well as R2 (R-Square, 0.5275). The linear regression equation given is a mathematical representation of a straight line that passes through a plot of the data, and can be used to predict the dependent variable (HOMICIDES) given a value for the independent variable (HANDGUNS).
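The quantities reported for this example (r, R-Square and the fitted line) are those of an ordinary least-squares fit. As a minimal sketch, the Python below (not part of KWIKSTAT; the function and variable names are made up for illustration) recomputes them from the handgun table, assuming the table values are exactly the data held in the HANDGUNS database. It also evaluates the fitted line at 300 handguns registered as an example of a prediction.

```python
import math

# Handguns registered (x) and homicide rate (y), Detroit 1961-1973,
# copied from the handgun study table.
handguns = [178.15, 156.41, 198.02, 222.10, 301.92, 391.22, 665.56,
            1131.21, 837.60, 794.90, 817.74, 583.17, 709.59]
homicides = [8.60, 8.90, 8.52, 8.89, 13.07, 14.57, 21.36,
             28.03, 31.49, 37.39, 46.26, 47.24, 52.33]


def fit_line(x, y):
    """Least-squares fit of y = intercept + slope * x; also returns r."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products about the means.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)   # Pearson's correlation coefficient
    return intercept, slope, r


intercept, slope, r = fit_line(handguns, homicides)
predicted = intercept + slope * 300   # predicted rate for x = 300

print(f"intercept = {intercept:.4f}, slope = {slope:.6f}")
print(f"r = {r:.4f}, R-Square = {r * r:.4f}")
print(f"predicted homicide rate at 300 handguns = {predicted:.2f}")
```

In simple linear regression R-Square is just the square of r, which is why the two reported values (0.7263 and 0.5275) move together.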
In this case the linear regression equation is:

     HOMICIDES = 4.910512 + 3.761144E-02 * HANDGUNS

If you want to predict the homicide rate for 300 handguns registered, you would use the equation:

     HOMICIDES = 4.910512 + 3.761144E-02 * 300

which gives a predicted homicide rate of about 16.19.

A t-test is performed to test the statistical significance of the linear relationship between the two variables. A low p-value means that the two variables are significantly related.

Regression Plots: These plots are helpful to determine if a linear fit to the data is appropriate. The scatterplot shows you how compact or spread apart the points are around the fitted line, and may help you discover outliers. The residual plot helps you determine if a linear fit is appropriate.

Predicting new values: When you choose this option, you will be prompted for one or more X variable values for which you wish to predict Y variable values.

MULTIPLE REGRESSION ANALYSIS

Multiple regression is an extension of simple linear regression into several dimensions (several independent variables). In the multiple regression procedure, you must enter a list of the independent variables and a single dependent variable on which you wish to perform the regression analysis. In KWIKSTAT you may use up to 10 independent variables in this option.

Multiple regression can be complicated. KWIKSTAT calculates and displays several results, including the coefficients and intercept of the regression "line". A significance test is performed to determine the significance of the contribution of the different variables or factors to the model (mathematical representation).

EXAMPLE MULTIPLE REGRESSION ANALYSIS (LONGLEY DATA)

Longley introduced a data set which has often been used in comparing multiple linear regression procedures in the literature. The variables refer to economic factors. This example uses the LONGLEY database on the KWIKSTAT disk.
Follow these steps to perform a multiple linear regression:

Step 1  Open the database: Open the database named LONGLEY. If you are at the main menu, select the Open a Database option from the FILE menu, then choose Regression and Correlation from the ANALYZE menu. If you are in the Regression module, select "ChooSe a Database."

Step 2  Select analysis type: From the Regression menu, choose the "Multiple Linear Regression" option.

Step 3  Select the fields to use: The LONGLEY database consists of 7 fields: DEFLATOR, GNP, UNEMP, ARMED, POP, TIME, and TOTAL. The first six of these will be used as independent variables and the seventh, TOTAL, is the dependent variable (the one to be predicted). Select TOTAL as the DEPENDENT variable and DEFLATOR, GNP, UNEMP, ARMED, POP, TIME as the INDEPENDENT variables, then select "Finished Choosing Fields."

Step 4  View the results: KWIKSTAT will perform calculations and display the results as shown in figure 4.22.

The table at the top of the output (in Figure 4.22) tells you the intercept value and the coefficient values for each of the independent variables. These can be used to create an equation for prediction of the dependent variable. In this case, the equation is:

     TOTAL = -3481930.1065
             + DEFLATOR*(15.0161517122)
             + GNP*(-0.03579443400)
             + UNEMP*(-2.0199053296)
             + ARMED*(-1.0332049046)
             + POP*(-0.05130725587)
             + TIME*(1828.99249535)

Note: Although the results are reported to 8 to 9 decimal places, it is usually not appropriate or necessary to use this many decimal places. After viewing or printing the results, exit the viewer.

Step 5  View the Plots: From the options menu, you can choose to view residual plots to determine if the data are linear.
If the residual plots do not show random patterns, you should determine if there is a transformation you can perform on the data to make it linear.

Step 6  Forecast new values: From the options menu, you can choose to predict new values from the calculated regression line.

KWIKSTAT also reports R-Square, which gives you a measure of how well the regression "line" fits the data. It is a good idea to view plots of residuals. The plots are helpful to determine if regression analysis is appropriate. A pattern other than a random horizontal band about zero indicates that the assumptions necessary for a regression procedure may be violated.

CORRELATION ANALYSIS

The correlation coefficient is a measure of the strength of the linear relationship between two variables. KWIKSTAT allows you to find both Pearson's and Spearman's (rank) correlation coefficients of two variables. It also displays the matrix of correlation coefficients of pairs of variables when there are more than two variables being considered. This example uses the Longley data described in the Multiple Regression example above. To display the correlation matrix, use these steps:

Step 1  Open the database: Open the database named LONGLEY.

Step 2  Select analysis type: From the Regression menu, choose the "Correlation Matrix" option.

Step 3  Select the fields to use: The LONGLEY database consists of 7 fields: DEFLATOR, GNP, UNEMP, ARMED, POP, TIME, and TOTAL. Select all the fields for this analysis.

KWIKSTAT will perform the calculations and display a 7 by 7 matrix. Only half of the array is displayed since the other half is a mirror image. The diagonal entries are also omitted since they are all one; a variable is always perfectly correlated with itself.
Each entry in the array consists of two numbers (three numbers if the information is printed to a printer). The first (upper) number is the Pearson's correlation coefficient for the two (row and column) variables of that entry. The second (middle) number, in parentheses, is the p-value of the t-test for Ho: rho = 0 vs. Ha: rho <> 0. In the hard copy printout (if requested), the third (bottom) number, in brackets, is the sample size, or number of paired observations used in the calculations.

EXAMPLE: GRAPHICAL CORRELATION MATRIX (LONGLEY DATA)

This example uses the Longley data used in the previous example. To perform this analysis follow these steps:

Step 1  Open the database: Open the database named LONGLEY.

Step 2  Select analysis type: From the Regression menu, choose the "Graphical Correlation Matrix" option.

Step 3  Select the fields to use: The LONGLEY database consists of 7 fields: DEFLATOR, GNP, UNEMP, ARMED, POP, TIME, and TOTAL. Select all the fields for this analysis.

KWIKSTAT will perform the calculations and display the scatterplots.

These scatterplots are a visual way of examining the relationships between pairs of variables. They allow you to determine whether a relationship exists between the variables and whether that relationship is linear. You can use this graphical correlation matrix to examine the relationships between variables before using them in a multiple regression analysis.

USING FREQUENCY AND CROSSTABULATION PROCEDURES

The Crosstabulations, Frequencies, Chi Square module performs analyses on categorical data, that is, data observed in categories, rather than measurement data.

EXAMPLE: FREQUENCY TABLE, BAR AND PIE CHARTS

This example uses the EXAMPLE database file.
One of the fields (variables) in this database is STATUS, referring to socioeconomic status. Suppose you want to know how the total data set is divided up into the five levels of STATUS. You also want to produce a visual display of this information. To perform this analysis follow these steps:

Step 1  Open the database: Open the database named EXAMPLE, then choose the Crosstabulations, Frequencies, Chi Square option from the ANALYZE menu. If you are already in the Crosstabulations, Frequencies, Chi Square module, select the "ChooSe Database" option.

Step 2  Choose the analysis type: Select the "Frequencies" option from the Crosstabulations, Frequencies, Chi Square module menu.

Step 3  Select field: You will be prompted to select one field (variable) to use. Since you want to produce a frequency table for STATUS, select STATUS from the field list.

Step 4  View results: When you choose to view/print the results, a frequency table is displayed as shown in Figure 4.26. Exit the viewer with F7. If you select "Pie Chart," a pie chart, as described earlier in this chapter (Graph Module), will be displayed. If you choose "Bar Chart," a bar chart, as described earlier in this chapter, will appear.

PERFORMING A GOODNESS OF FIT ANALYSIS

A goodness-of-fit test of a single population is a test to determine whether the distribution of observed frequencies in the sample data closely matches the expected number of occurrences under a hypothetical distribution of the population.

According to a genetic theory, crossbred pea plants show a 9:3:3:1 ratio of yellow smooth, yellow wrinkled, green smooth, green wrinkled offspring.
Out of 250 plants, under the theoretical ratio (distribution) of 9:3:3:1, you would expect about

     (9/16) x 250 = 140.625 yellow smooth peas
     (3/16) x 250 =  46.875 yellow wrinkled peas
     (3/16) x 250 =  46.875 green smooth peas
     (1/16) x 250 =  15.625 green wrinkled peas

After growing 250 of these pea plants, you observe that

     152 have yellow smooth peas
      39 have yellow wrinkled peas
      53 have green smooth peas
       6 have green wrinkled peas

To perform this analysis, use the following steps:

Step 1  Select the analysis type: From the Crosstabulations, Frequencies, Chi Square menu choose the "Goodness-of-Fit" option.

Step 2  Enter the data: You will be prompted to enter the number of categories. In this case, type 4 for the four categories of peas (yellow smooth, yellow wrinkled, green smooth, green wrinkled) and press Enter. You will also be asked whether you want to enter the expected ratios or the actual expected values into the table. If you choose to enter ratios, you will enter 9,3,3,1.

An empty table will appear with the instructions to enter the observed values for each category. Enter the observed values given above, pressing Enter after each entry. For example, for the first row, enter 152 for observed (press Enter), then enter 39 (press Enter), and so on. KWIKSTAT will perform the calculations (including filling in the expected values column) and display the results.

The calculated chi-square statistic in this case is 8.97 and the p-value is 0.031. At a 0.05 level of significance, this p-value indicates that there is enough evidence to reject the null hypothesis that the observed values follow the theoretical distribution.
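
The expected counts and the chi-square statistic for the pea data can be checked by hand. The following Python sketch does the arithmetic with the ratios and observed counts given above; the variable names are illustrative, and the critical value 7.815 (chi-square, 3 degrees of freedom, 0.05 level) comes from standard tables rather than from KWIKSTAT.

```python
# Theoretical ratio and observed counts from the pea example.
ratios = [9, 3, 3, 1]
observed = [152, 39, 53, 6]

total = sum(observed)  # 250 plants
# Expected count in each category under the 9:3:3:1 ratio.
expected = [r / sum(ratios) * total for r in ratios]

# Pearson chi-square statistic: sum of (observed - expected)^2 / expected.
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1  # degrees of freedom: 4 categories - 1

# Tabulated 0.05 critical value for 3 degrees of freedom.
CRITICAL_05_DF3 = 7.815

print(expected)
print(round(chi_square, 2), "with", df, "degrees of freedom")
print("Reject at 0.05:", chi_square > CRITICAL_05_DF3)
```

The statistic rounds to 8.97, matching the value KWIKSTAT reports, and it exceeds the tabulated critical value, which agrees with rejection at the 0.05 level.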
At the 0.05 significance level, the test therefore suggests that a 9:3:3:1 ratio of yellow smooth:yellow wrinkled:green smooth:green wrinkled peas is not an appropriate distribution for the population from which these data are taken.

PERFORMING A CROSSTABULATION ANALYSIS (CHI-SQUARE)

Crosstabulations can be used to perform a chi-square test for independence or a chi-square test for homogeneity. A two-way table is constructed that displays the number of counts for each category. Crosstabulation table options you may choose for constructing a table include:

     A) Frequencies only
     B) Include Expected Values
     C) Include Expected Values and Percents
     D) Include Expected Values, Chi-Contribution and Percents
     E) Include Percents
     F) Include Expected Values and Chi-Contribution
     Q) Quit this option

EXAMPLE: 2 BY 2 CROSSTABULATION TEST FOR INDEPENDENCE

Data for this example are observations of the number of beetles and bugs on the upper and lower sides of leaves (Zar, 1974, page 292).

     2 by 2 Contingency Table Data

                   Beetles    Bugs
                   ---------------
     Upper Leaf       12        7
     Lower Leaf        2        8

Since you are given only the totals for each of the four categories, and not the individual data for each leaf, there is no need to create a database. Rather, you can just enter these totals from the keyboard. To perform this analysis, follow these steps:

Step 1  Choose analysis type: From the Crosstabulations, Frequencies, Chi Square menu, select the "Crosstabulations, Chi-Square" option. You will be asked if you want to "Read data from the database" or "Enter data from the keyboard." For this example, select "K" to enter data from the keyboard.

Step 2  Select size of table: You will then be prompted to give the size of the table. When asked for the number of rows, type 2 and press Enter.
When asked for the number of columns, again type 2 and press Enter. An empty table will appear with the instructions to enter the counts for each category into the appropriate cell.

Step 3  Enter the data: Enter the values given above, pressing Enter after each entry. KWIKSTAT will perform the calculations and display the results.

The calculated chi-square statistic in this case is 4.89 with a p-value of 0.028. The chi-square with Yates correction is 3.31 with a p-value of 0.069, and the Fisher Exact Test (two tail) has a p-value of 0.050. Because one of the cells produces an expected value less than 5, KWIKSTAT gives a warning that the chi-square analysis for these data may not be valid. Given this warning, it is best to rely on Fisher's Exact Test for making a decision.

EXAMPLE: ANALYZING A LARGER TABLE (SEX BY HAIR COLOR)

A generalization of the 2 by 2 table is the R by C (Rows by Columns) table. This is an example (Zar, 1984, page 62) of a 2 by 4 contingency table involving the variables hair color and sex. The null hypothesis is that there is no relationship between hair color and sex.

     2 by 4 Contingency Table Data (sex by hair color)

                          HAIR COLOR
     Sex        Black    Brown    Blonde    Red
     -----------------------------------------
     Male         32       43       16        9
     Female       55       65       64       16

Since you are given only the totals for each of the eight categories, and not the individual data for each person, there is no need to create a database. Rather, you can just enter these totals from the keyboard. To perform this analysis, follow these steps:

Step 1  Choose analysis type: From the Crosstabulations, Frequencies, Chi Square menu, select the "Crosstabulations, Chi-Square" option. You will be asked if you want to enter data from a (D)atabase or (K)eyboard. Type K and press Enter.
Step 2  Select size of table: You will then be prompted to give the size of the table. When asked for the number of rows, enter 2. When asked for the number of columns, enter 4. An empty table will appear with the instructions to enter the counts for each category into the appropriate cell.

Step 3  Enter the data: Enter the values given above, pressing Enter after each entry. KWIKSTAT will perform the calculations and display the results.

Step 4  Analyze the results: The calculated chi-square statistic in this case is 8.99 with a p-value of 0.03. A decision can be made using this p-value. A low p-value (less than the chosen significance level) is usually taken to indicate rejection of the null hypothesis.

CREATING A 3-D BAR CHART

As an option when performing a Crosstabulation, KWIKSTAT allows you to draw a 3-dimensional bar chart of the data in a contingency table (crosstabulation), and then to zoom in on part of it if desired. Data for the 3-dimensional bar chart must be entered first, either from the keyboard or a database, by using the "Crosstabulations, Chi-Square" option.

MCNEMAR'S TEST

McNemar's test is appropriate for use with paired, dichotomous data. This test is sometimes called a test for related samples or a test for the significance of changes. It is useful for comparing paired or related observations in which the response is dichotomous, that is, the response is one of only two possible outcomes. McNemar's test is the 2 by 2 version of Cochran's Q test described in the section on non-parametric tests. The test assumes that any pair of observations is independent of any other pair of observations, although clearly the observations within a pair are not independent of each other.
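
As an illustration of the idea, the usual textbook form of McNemar's statistic uses only the discordant pairs, that is, the pairs whose two responses differ. The Python sketch below is a minimal version under that assumption; the manual does not say which form KWIKSTAT computes or whether it applies a continuity correction, and the counts b and c are hypothetical.

```python
def mcnemar_statistic(b, c, continuity_correction=True):
    """Chi-square statistic (1 degree of freedom) from the discordant pairs.

    b = pairs that changed from outcome 1 to outcome 2
    c = pairs that changed from outcome 2 to outcome 1
    """
    if b + c == 0:
        raise ValueError("no discordant pairs; the test is undefined")
    diff = abs(b - c)
    if continuity_correction:
        diff = max(diff - 1, 0)  # common continuity correction
    return diff ** 2 / (b + c)

# Hypothetical paired data: 15 pairs changed one way, 5 the other way.
b, c = 15, 5
uncorrected = mcnemar_statistic(b, c, continuity_correction=False)
corrected = mcnemar_statistic(b, c)

# Tabulated 0.05 critical value of chi-square with 1 degree of freedom.
CRITICAL_05_DF1 = 3.841

print(uncorrected, corrected)
print("Reject at 0.05:", corrected > CRITICAL_05_DF1)
```

Pairs in which both responses agree do not enter the statistic at all, which is why only the two discordant counts are needed.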
LIFE TABLES AND SURVIVAL ANALYSIS

Survival Analysis is used to analyze the survival experience of a group of persons or components. In medical research, survival analysis is helpful in studying the survival of patients under one or more conditions. In industry, the survival may be that of a component such as an electronic switch or a gear. To perform a survival analysis, the data must be in the following form:

     1) a TIME variable which contains a time (e.g., minutes, days, years, etc.) during which the subject or component has been observed to be alive (not failed).

     2) a CENSOR variable which must take on the values 0 or 1, where 1 means the subject has died (failed), and 0 means the subject was still alive (not failed) at the last available time period.

     3) optionally, a GROUPING variable which may have up to ten values (numeric or character), i.e., the data may be in groups.

KWIKSTAT allows you to choose from two types of life tables, Actuarial or Kaplan-Meier. The Actuarial method uses fixed length intervals in the table, and the Kaplan-Meier table uses intervals based on the data.

Once the data are entered into the program, a life table for each group is produced which includes, for each time interval, the number entered, withdrawn, lost, dead, exposed, the proportion dead, proportion surviving, cumulative proportion surviving, and other information. A plot is given for the cumulative proportion surviving in the group(s) against time.

If more than one group is entered, a Mantel-Haenszel log-rank test is performed to test the hypothesis of equal survival patterns for the groups. The development of this test is covered in Matthews and Farewell (1988).
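
The life table columns described above can be sketched in a few lines of Python. This is an illustration under a stated assumption, not KWIKSTAT's code: it uses the standard actuarial convention that withdrawn and lost subjects count as exposed for half of the interval. The function names are made up. The first row is chosen to agree with the 37 exposed and 59.5% dead reported for group 1 in the LIFE example later in this chapter; the other rows are hypothetical.

```python
def actuarial_interval(entered, withdrawn, lost, dead):
    """Return (exposed, proportion dead, proportion surviving) for one interval.

    Assumes withdrawn and lost subjects are at risk for half the interval.
    """
    exposed = entered - (withdrawn + lost) / 2
    proportion_dead = dead / exposed
    return exposed, proportion_dead, 1 - proportion_dead

def life_table(rows):
    """Build rows of (exposed, prop. dead, prop. surviving, cumulative surviving)."""
    cumulative = 1.0
    table = []
    for entered, withdrawn, lost, dead in rows:
        exposed, p_dead, p_surv = actuarial_interval(entered, withdrawn, lost, dead)
        cumulative *= p_surv  # cumulative proportion surviving to interval end
        table.append((exposed, p_dead, p_surv, cumulative))
    return table

# (entered, withdrawn, lost, dead) for successive fixed-length intervals.
rows = [
    (38, 2, 0, 22),  # agrees with group 1, first interval of the LIFE example
    (14, 1, 0, 6),   # hypothetical
    (7, 0, 1, 3),    # hypothetical
]
table = life_table(rows)
for exposed, p_dead, p_surv, cumulative in table:
    print(exposed, round(p_dead, 4), round(p_surv, 4), round(cumulative, 4))

# Group 2, first interval of the LIFE example: 12 dead, one withdrawn.
group2 = actuarial_interval(24, 1, 0, 12)
print(group2[0], round(100 * group2[1], 1))
```

Under this convention an interval entered by 24 subjects with one withdrawal has 23.5 exposed, which is how a fractional "exposed" count can arise in the table.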
EXAMPLE: ACTUARIAL LIFE TABLE ANALYSIS

The data for this example are in the LIFE database on the KWIKSTAT disk. These data are from Prentice (1973). To perform this analysis, follow these steps:

Step 1  Open the database: If you are at the main KWIKSTAT menu, choose the Open Database option from the FILE menu and select the database LIFE, then choose the Life Table and Survival Analysis option from the ANALYZE menu. If you are already in the Life module, open the LIFE database by selecting the "ChooSe database to open" option.

Step 2  Choose the analysis type: Select the "Actuarial Life Table Analysis" option from the Life Table and Survival Analysis menu.

Step 3  Choose fields to use: The LIFE database consists of 3 fields: SURVIVAL, CENSOR, and GROUP. A portion of the LIFE database is shown here:

     SURVIVAL   CENSOR   GROUP
         72        1       1
        411        1       1
        228        1       1
         11        1       1
         25        0       1
        144        1       1
     etc...

The first column is the SURVIVAL field with entries of length of life, or length of survival. The second column is the CENSOR field, an indicator of whether or not the subject has failed (died) at the last observed time period: 1 means failed, 0 means not failed (still alive). The third column contains a grouping variable. In this case it is either 1 or 2. Group 1 may represent one treatment, while group 2 represents another kind of treatment. The objective is to compute survival curves to see whether the treatments produce different survival distributions.

Select SURVIVAL as the TIME variable, CENSOR as the censor variable and GROUP as the grouping variable.

Step 4  Select interval length: KWIKSTAT reports the names and sizes of the groups and then asks you to specify the length of each interval for the table to be produced.
You can specify a desired interval length or you can use the default length by simply pressing Enter. For this example, press Enter to select the default length.

Step 5  Analyze the results: KWIKSTAT will perform the calculations and display an options menu. If you choose the View/Print option, two sets of tables, one for each group, will be displayed. The first table includes the numbers of subjects entered alive, withdrawn, dead, exposed, the proportion dead, proportion alive, cumulative survival proportion and standard error for the first group. The second table includes 95% confidence limits on the cumulative survival proportion.

From the table, you can see that, in the first group, 22 of 37 exposed, or 59.5%, died in the first interval (0.0-99.0) and two were withdrawn. In the second group, 12 of 23.5 exposed (51.1%) died and one was withdrawn in the first interval. Exit the viewer by selecting F7/Exit.

Comparing Survival Curves: At the end of the report, KWIKSTAT reports the results of the Mantel-Haenszel comparison of the two curves. The hypotheses being tested are:

     Ho: The survival curves are the same.
     Ha: The survival curves are not the same.

In this example, the Mantel-Haenszel comparison procedure results in a chi-square statistic of 0.7191 and a p-value of 0.397. This p-value is much too large to reject the hypothesis of equal curves. The two distributions are therefore not statistically significantly different, so there is no evidence that either treatment is superior in terms of survival.

Displaying Survival Curves: From the options menu, you can choose to display survival curves. This is a graphical representation of the cumulative proportion surviving from the life table. The survival curve for this analysis is shown.

The Kaplan-Meier life table contains most of the same information as the Actuarial Life Table. However, instead of the time intervals being fixed, the time intervals are based on time values from the data.
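
To show what "intervals based on time values from the data" means, here is a minimal Python sketch of the usual Kaplan-Meier product-limit calculation. It is an illustration, not KWIKSTAT's implementation; the function name and the five records are hypothetical, and the censor coding follows the manual (1 = failed, 0 = still alive).

```python
def kaplan_meier(records):
    """Product-limit estimate from (time, censor) pairs.

    censor: 1 = failed (died), 0 = still alive at last observation.
    Returns rows of (time, number at risk, deaths, cumulative surviving),
    one row for each distinct time at which a failure occurred.
    """
    records = sorted(records)
    at_risk = len(records)
    survival = 1.0
    table = []
    i = 0
    while i < len(records):
        t = records[i][0]
        deaths = 0
        removed = 0
        # Gather every subject whose observation ends at time t.
        while i < len(records) and records[i][0] == t:
            deaths += records[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            table.append((t, at_risk, deaths, survival))
        at_risk -= removed  # failures and censored subjects both leave
    return table

# Hypothetical records: (time, censor)
records = [(5, 1), (8, 0), (12, 1), (12, 1), (20, 0)]
km = kaplan_meier(records)
for t, n, d, s in km:
    print(t, n, d, round(s, 4))
```

The table has one row per observed failure time, so the intervals come from the data themselves. Censored subjects reduce the number at risk for later times without producing a row of their own.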
==========
REFERENCES
==========

Box, Jenkins, and Reinsel, Time Series Analysis - Forecasting and Control, Prentice-Hall, 1994.

Deming, W.E., Out of the Crisis, Cambridge, MA: Massachusetts Institute of Technology, Center for Advanced Engineering Study, 1986.

Dixon, W.J. and Massey, F.J., Introduction to Statistical Analysis, McGraw-Hill Book Company, New York, 1969.

Elliott, A.C. and Woodward, W.A., "Analysis of an Unbalanced Two-Way ANOVA on the Microcomputer", Communications in Statistics, Volume B15, Number 1, 1986.

Gunst, R.F., and Mason, R.L., Regression Analysis and its Applications, Marcel Dekker, New York, 1980.

Granger, C.W.J. and Newbold, P., Forecasting Economic Time Series, Academic Press, 1977.

Hoaglin, D.C., Mosteller, F., Tukey, J.W., Understanding Robust and Exploratory Data Analysis, John Wiley & Sons, Inc., New York, 1983. (Box and Whiskers Plots, Stem and Leaf Displays)

Kennedy, W.J., Jr., and Gentle, J.E., Statistical Computing, Marcel Dekker, Inc., New York, 1980.

Larsen, R.J. and Marx, M.L., Statistics, Prentice-Hall, 1990.

Lehmann, E.L., Nonparametrics: Statistical Methods Based on Ranks, Holden-Day, Inc., Oakland, CA, 1975.

Longley, J.W., "An appraisal of least squares programs for the electronic computer from the point of view of the user", JASA 62, 819-831, 1967.

Montgomery, D.C., Introduction to Statistical Quality Control, John Wiley and Sons, 1991.

Neter, J., Wasserman, W., and Kutner, M.H., Applied Linear Statistical Models, Third Edition, Richard D. Irwin, Inc., 1990.

Prentice, R.L., "Exponential survivals with censoring and explanatory variables", Biometrika 60, 279-288, 1973.

Ryan, Thomas P., Statistical Methods for Quality Improvement, John Wiley & Sons, New York, 1989.

Tukey, J.W., Exploratory Data Analysis, Addison-Wesley, 1977.
Tsay, R.S., and Tiao, G.C., "Consistent Estimates of Autoregressive Parameters and Extended Sample Autocorrelation Functions for Stationary and Nonstationary ARMA Models," JASA 79, 84-96, 1984.

Winer, B.J., Statistical Principles in Experimental Design, Second Edition, McGraw-Hill Book Company, 1971.

Woodward, W.A., Elliott, A.C., Gray, H.L., and Matlock, D.C., Directory of Statistical Microcomputer Software, Marcel Dekker, New York, 1988.

Woodward, W.A., and Gray, H.L., "On the relationship between the S-array and the Box-Jenkins Method of ARMA Model Identification," JASA 76, 579-587, 1981.

Zar, J.H., Biostatistical Analysis, Prentice Hall, Inc., 1974 and 1984 editions.

===========
APPENDIX A
===========

INTERPRETING ERROR CODES

While every precaution is taken in writing software, all possibilities of use cannot be anticipated. If the program encounters a problem it does not know how to resolve, it will usually display an error message. This message will contain an error code and a reference code. If you encounter an error message, write it down and refer to this list to see if you can figure out how to resolve the problem. If you are unable to resolve the problem, please write down the steps you took before the error was encountered, and send them to TexaSoft on the Problem Report Form. We will try to resolve the problem as quickly as possible.
     Error Number  5 = Illegal function call
     Error Number  6 = Overflow
     Error Number  7 = Out of Memory
     Error Number  9 = Subscript out of range
     Error Number 11 = Division by zero
     Error Number 14 = Out of String Space
     Error Number 24 = Device Timeout
     Error Number 25 = Device fault
     Error Number 27 = Out of Paper
     Error Number 50 = FIELD overflow
     Error Number 51 = Internal Error
     Error Number 52 = Bad filename or number
     Error Number 53 = File not found
     Error Number 54 = Bad file mode
     Error Number 55 = File already open
     Error Number 57 = Device I/O error
     Error Number 58 = File already exists
     Error Number 61 = Disk full
     Error Number 62 = Input past end of file
     Error Number 63 = Bad record number
     Error Number 64 = Bad filename
     Error Number 67 = Too many files
     Error Number 68 = Device unavailable
     Error Number 70 = Permission denied
     Error Number 71 = Disk not ready
     Error Number 72 = Disk media error
     Error Number 74 = Rename across disks
     Error Number 75 = Path/File access error
     Error Number 76 = Path not found
     Error Number 81 = Invalid filename

PROBLEM REPORT FORM: KWIKSTAT

Please explain in detail the problem that occurred. If possible, send a printout of the results or a Print Screen.

KWIKSTAT VERSION#_____________ RELEASE#________________
(see opening screen)

KWIKSTAT module where problem occurred:______________________
(It is often helpful if you can indicate the precise commands you used leading up to the problem.)

Your computer brand/model_____________________________________
(Also indicate if it is an 8088, 286, 386, 486, Pentium)

Monitor type (circle one): EGA or VGA (including SuperVGA)

Amount of free memory available:____________________
(Use the MEM or CHKDSK command to find this out.)
Version of DOS you are using:____________________________________
(Use the VER command to find this out.)

Memory resident program in use:__________________________________

Running KWIKSTAT from (e.g., DOS, Windows, DOS SHELL):__________

PROBLEM:

Mail to: TexaSoft, P.O. Box 1169, Cedar Hill, Texas 75106-1169. Or fax to 214-291-3400, or send E-Mail to Compuserve 70721,3145 or Internet 70721.3145@compuserve.com.

USER'S BALLOT

Please indicate your preference for improvements in KWIKSTAT, on a scale of 0 to 10:

      0 = Very Low priority for this change
     10 = Very High priority for this change

     Vote   Proposed item of change
     ----   ---------------------------------------------------------
     ____   Windows version
     ____   Add more ANOVA types - example:___________
     ____   Add more Non-parametric statistical tests
     ____   Add General Linear Model
     ____   Make Report more flexible
     ____   Add more Quality Control options
     ____   Cluster Analysis
     ____   Discriminant Analysis
     ____   Add other types of analysis - example:__________
     ____   _____________________________________________
     ____   _____________________________________________
     ____   _____________________________________________
     ____   _____________________________________________

Other Comments: (Your ideas are very important to us.)

Mail to: TexaSoft, P.O. Box 1169, Cedar Hill, Texas 75106-1169. Fax to: 214-291-3400 or send E-Mail to Compuserve 70721,3145 or Internet 70721.3145@compuserve.com.